Even hard mode isn't that hard because GPT-2 tends to ramble on while saying nothing substantive. If I can't figure out what a paper is supposed to be talking about, it's fake.
Did way better on Hard mode than Easy. I think people get bored of doing this before we can see real indicative results.
Scores under 5 on what amounts to a coin flip don't strike me as so remarkable, especially when coupled with an incentivised reporting bias as we see here. ("I got a high score! Proud to share!" vs. "I got a low score, or an even score, and look at all the people reporting high scores; think I might keep it to myself.")
Being as it is, at this juncture, I think the AI may still have a chance to be strong with this one.
Also, were the AI to do well consistently, I'd think it might say more about the external unfamiliarity with, and the internal prevalence of, field-specific scientific jargon, than any AI's or human's innate intelligence.
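To put rough numbers on the coin-flip point above, here is a minimal sketch. It assumes every guess is an independent 50/50 flip, which is the pessimistic case and not a claim about how the game actually scores. The function names and the figure of 20 players are hypothetical, picked only for illustration.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Chance of k or more correct answers out of n independent guesses."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_someone_perfect(n, players, p=0.5):
    """Chance that at least one of `players` pure guessers gets all n right."""
    return 1 - (1 - p**n) ** players


if __name__ == "__main__":
    # A single pure guesser going 4 for 4.
    print(prob_at_least(4, 4))  # 0.0625
    # Hypothetical: 20 pure guessers, at least one of them going 4 for 4.
    print(round(prob_someone_perfect(4, 20), 3))
```

Under those assumptions a lone perfect short streak is unlikely, but once a couple of dozen people play and mostly the winners post, seeing a few perfect scores in the comments is close to expected.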
The engineering/materials science/physics ones were fairly easy to identify for me. Usually it would be one or two sentences that were grammatically cohesive but would make a statement that didn't make any sense if you had even a basic understanding of the topic. One that stood out to me was an astrophysics paper that said a planet was orbiting solar wind. I don't have to be a PhD to know that's BS.
Yes, this mirrors my experience. Fields that have my interest are pretty easy in isolation (just looking at one subject), but fields that are remote can be a challenge.
In this instance, the generated medical and biotech material is much harder to identify because the algorithm doesn't need to introduce grammar issues. For instance, here is a random paper abstract that I changed. Can you spot the change? Hint: it is one of the Greek symbols or a number.
Three highly pathogenic β-coronaviruses have crossed the animal-to-human species barrier in the past two decades: SARS-CoV, MERS-CoV and SARS-CoV-2. To evaluate the possibility of identifying antibodies with broad neutralizing activity, we isolated a monoclonal antibody, termed B4, that cross-reacts with eight β-coronavirus spike glycoproteins, including all five human-infecting β-coronaviruses.
Same for me: hard mode is quite easy. Our brain is a pattern-matching engine that constantly tries to make sense of things, so the text may look legitimate; it wouldn’t if text were 2D.
4/4 on hard. Never read a Nature paper before.