A new ‘AI scientist’ can write scientific papers without human input. Here’s why that’s a problem

 

Scientific discovery is one of humanity’s greatest achievements. First, scientists need to understand the existing knowledge and identify a significant gap. Next, they must formulate a research question, then design and conduct an experiment to find an answer. Finally, they must analyze and interpret the results of the experiment, which may lead to yet another research question.

Can a process this complex be automated? Last week, Sakana AI Labs announced the creation of an “AI scientist” – an artificial intelligence system that it claims can make scientific discoveries in the field of machine learning in a fully automated way.

Using large language models (LLMs) such as those behind ChatGPT and other AI chatbots, the system can brainstorm, select a promising idea, code new algorithms, plot results, and write a paper summarizing the experiment and its findings, complete with references. Sakana says the AI tool can carry out the complete lifecycle of a scientific experiment for US$15 per paper – less than the price of a scientist’s lunch.

These are big claims. Do they stack up? And even if they do, would an army of AI scientists churning out research papers at inhuman speed be good news for science?

How computers can do ‘science’

Most science is done in the open, and almost all scientific knowledge is written down somewhere (or we would have no way of knowing it). Millions of scientific papers are freely available online in repositories such as arXiv and PubMed.

LLMs trained on this data capture the language of science and its patterns. So perhaps it’s not surprising that a generative LLM can produce something that looks like a good scientific paper – it has absorbed many examples that it can copy.

What is less clear is whether an AI system can produce an interesting scientific paper. Crucially, good science requires novelty.

But is it interesting?

Scientists don’t want to be told things that are already known. Instead, they want to learn new things, especially new things that are significantly different from what is already known. This requires judgment about the scope and value of a contribution.

The Sakana system tries to address interestingness in two ways. First, it compares new paper ideas against existing research (indexed in the Semantic Scholar repository). Anything too similar is discarded.

Second, the Sakana system introduces a “peer review” step – using another LLM to judge the quality and novelty of the generated paper. Here again, there are many examples of peer review online at sites such as openreview.net that can guide how to critique a paper. LLMs have absorbed these, too.
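To make the first of these steps concrete, here is a minimal sketch of a similarity filter in Python. It is an illustration only and does not reproduce Sakana’s method: the word-overlap (Jaccard) measure, the 0.6 threshold, the function names and the example titles are all assumptions made for this sketch, and a real system would query an index such as Semantic Scholar rather than a hard-coded list.

```python
import re


def tokens(text):
    """Return the set of lowercase words in a title or abstract."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_novel(ideas, known_titles, threshold=0.6):
    """Keep only ideas whose overlap with every known title is below the threshold.

    The threshold is an arbitrary illustrative value, not one taken from Sakana.
    """
    known = [tokens(title) for title in known_titles]
    kept = []
    for idea in ideas:
        idea_tokens = tokens(idea)
        if all(jaccard(idea_tokens, k) < threshold for k in known):
            kept.append(idea)
    return kept


# Hypothetical example data, invented for this sketch.
known_titles = ["Adaptive learning rates for diffusion models"]
ideas = [
    "Adaptive learning rates for diffusion models revisited",
    "Grokking in small transformers",
]
print(filter_novel(ideas, known_titles))
```

A filter like this only measures surface overlap between wordings. It cannot tell whether a differently worded idea is actually new, which is part of why judging interestingness is hard to automate.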

AI can be a poor judge of AI results

Feedback on Sakana AI’s output is mixed. Some have described it as creating “permanent scientific decline”.

Even the system’s own review of its output judges the papers to be weak. This is likely to improve as the technology advances, but the question of whether automatically generated scientific papers are valuable remains.

The ability of LLMs to judge the quality of research is also an open question. My own work (soon to be published in Research Synthesis Methods) shows that LLMs are not very good at judging the risk of bias in medical research studies, although this too may change over time.

Sakana’s system automates discovery in computational research, which is much easier than in other types of science that require physical experiments. Sakana’s experiments are done with code, which is also structured text that LLMs can be trained to generate.

AI tools to help scientists, not replace them

AI researchers have been developing ways to support science for years. Given the volume of published research, even finding publications related to a particular scientific question can be difficult.

Specialized search tools use AI to help scientists find and synthesize existing work. These include the aforementioned Semantic Scholar, as well as newer systems such as Elicit, Research Rabbit, scite and Consensus.

Text-mining tools like PubTator dig deeper into papers to identify key concepts, such as specific genetic mutations and diseases, and their established relationships. This is especially useful for curating and organizing scientific information.

Machine learning has also been used to support the synthesis and analysis of clinical evidence, in tools such as Robot Reviewer. Summaries from Scholarcy that compare and contrast the claims made in papers help with literature reviews.

All of these tools are intended to help scientists do their jobs better, not to replace them.

AI research may exacerbate existing problems

While Sakana AI says it doesn’t see the role of human scientists diminishing, the company’s vision of “AI-driven science” could have a major impact on science.

One concern is that, if AI-generated papers flood the scientific literature, future AI systems may be trained on AI output and undergo model collapse. This means that they may become increasingly ineffective at innovating.

However, the implications for science extend well beyond the effects on AI science systems themselves.

There are already bad actors in science, including “paper mills” that churn out fake papers. This problem will only get worse if a scientific paper can be produced with US$15 and a vague initial prompt.

The need to check for errors in a mountain of automatically generated research could quickly overwhelm the capacity of human scientists. The peer review system is arguably already broken, and dumping more research of questionable quality into it isn’t going to fix it.

Science is based on trust. Scientists emphasize the integrity of the scientific method so that we can be sure that our understanding of the world (and now, the world’s machinery) is valid and correct.

A scientific ecosystem in which AI systems are key players raises fundamental questions about the meaning and value of this process, and about how much trust we should place in AI scientists. Is this the kind of scientific ecosystem we want?
