Science in the age of large language models (LLMs)
Introduction
Large language models (LLMs) are a type of artificial intelligence (AI) that can generate and understand text. They are trained on massive datasets of text and code and can be used for a variety of tasks, including writing, translating, and answering questions.
LLMs have the potential to revolutionize science in a number of ways. For example, they can be used to:
- Generate new hypotheses and research ideas
- Analyze large datasets of scientific data
- Write scientific papers and reports
- Communicate scientific findings to the public
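The first of these use cases can be sketched in a few lines of code. The snippet below is a hypothetical illustration: `build_hypothesis_prompt` and `query_llm` are names invented for this sketch, and the model call is stubbed out so the example is self-contained; a real implementation would send the prompt to any hosted LLM API.

```python
# Sketch: asking an LLM to propose research hypotheses.
# `query_llm` is a placeholder for a real API call to a hosted model;
# it is stubbed here so the example runs without network access.

def build_hypothesis_prompt(topic: str, known_findings: list[str]) -> str:
    """Assemble a prompt asking the model for testable hypotheses."""
    findings = "\n".join(f"- {f}" for f in known_findings)
    return (
        f"Known findings about {topic}:\n{findings}\n\n"
        "Propose three new, testable hypotheses consistent with these findings."
    )

def query_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return "1. ...\n2. ...\n3. ..."

prompt = build_hypothesis_prompt(
    "protein folding",
    ["Hydrophobic residues cluster in the core.",
     "Chaperones assist folding under stress."],
)
response = query_llm(prompt)
```

The interesting design question is in the prompt, not the plumbing: constraining the model to hypotheses "consistent with these findings" is what makes the output checkable by a human scientist.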
However, there are also some potential risks associated with the use of LLMs in science. For example, LLMs can:
- Generate false or misleading information
- Produce biased or discriminatory results
- Be used to create deep fakes or other forms of disinformation
Major Stakeholders:
Here are some of the major companies that are developing and using large language models (LLMs):
- Google: Google AI has developed a number of LLMs, including LaMDA and PaLM, which power products such as its Bard chatbot. Google uses LLMs for a variety of tasks, including generating text, translating languages, and answering questions, and makes some of its models available to researchers and developers through its cloud services.
- OpenAI: OpenAI is an AI research company, originally founded as a non-profit laboratory, that develops and deploys LLMs. It has developed a number of LLMs, including GPT-2, GPT-3, and GPT-4, and uses them for a variety of tasks, including generating text, translating languages, and writing different kinds of creative content. OpenAI makes some of its LLMs available to researchers and developers through the OpenAI API.
- Meta: Meta (formerly Facebook) has developed a number of LLMs, including BART, RoBERTa, and Galactica. Meta is using LLMs for a variety of tasks, including improving the quality of its social media platforms, developing new AI products and services, and conducting research in artificial intelligence.
- Microsoft & Nvidia: Microsoft developed the Turing NLG model and, jointly with Nvidia, Megatron-Turing NLG. The two companies use LLMs for a variety of tasks, including improving Microsoft's Office products, developing new AI products and services, and conducting research in artificial intelligence.
- Amazon: Amazon has built a number of language-model-based services, including Amazon Comprehend (text analysis) and Amazon CodeWhisperer (code generation). Amazon uses these models for a variety of tasks, including improving its cloud computing services, developing new AI products and services, and conducting research in artificial intelligence.
In addition to these major companies, there are a number of other startups and research laboratories that are developing and using LLMs.
Here are some examples of how companies are using LLMs:
- Google is using LLMs to improve the quality of its search results. LLMs can be used to generate more relevant and informative snippets for search results.
- OpenAI is using LLMs to develop new AI products and services. For example, OpenAI is developing an LLM-powered chatbot that can be used for customer service.
- Meta is using LLMs to translate languages on its social media platforms. This makes it easier for people from different countries to communicate with each other.
- Microsoft is using LLMs to improve the quality of its Office products. For example, Microsoft is using LLMs to develop a new feature for Word that can automatically generate summaries of documents.
- Amazon is using LLMs to improve the quality of its cloud computing services. For example, Amazon is using LLMs to develop a new service that can automatically detect and fix errors in code.
LLMs are still under development, but they have the potential to revolutionize a wide range of industries. As LLMs continue to improve, we can expect to see them used for even more innovative and groundbreaking applications.
Here are some specific examples of how different LLM models developed by these companies are useful in scientific research:
- Google AI’s LaMDA (Language Model for Dialogue Applications) has reportedly been used to generate and evaluate hypotheses about the structure of proteins, which could help scientists develop new drugs and treatments for diseases.
- OpenAI’s GPT-3 has been used to develop new methods for detecting and correcting errors in scientific papers, which could help improve the quality and reliability of scientific research.
- Meta’s BART (Bidirectional and Auto-Regressive Transformers) has been used to develop new algorithms for translating scientific literature from one language to another, which could help scientists break down language barriers and collaborate with researchers from around the world.
- Microsoft’s Turing NLG has been used to generate summaries of scientific papers, which could help scientists keep up with the latest research in their field.
- Amazon’s Comprehend has been used to extract key information from scientific papers, such as the research question, methods, results, and conclusions, which could help scientists save time and improve the efficiency of their research.
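To make the Comprehend example concrete: a real call would use the boto3 client's `detect_key_phrases(Text=..., LanguageCode="en")` operation, whose response contains a `KeyPhrases` list of `Text`/`Score` entries. The sketch below parses a hand-written response of that shape so it runs offline; the `fake_response` contents are invented for illustration.

```python
# Sketch: extracting key phrases from a paper abstract, using the shape of
# Amazon Comprehend's detect_key_phrases response. A real call would use
# boto3: client.detect_key_phrases(Text=abstract, LanguageCode="en").
# The response below is a hand-written stand-in so the example runs offline.

def top_phrases(response: dict, min_score: float = 0.9) -> list[str]:
    """Keep phrases the service is confident about, ordered by score."""
    phrases = [p for p in response["KeyPhrases"] if p["Score"] >= min_score]
    phrases.sort(key=lambda p: p["Score"], reverse=True)
    return [p["Text"] for p in phrases]

fake_response = {
    "KeyPhrases": [
        {"Text": "the research question", "Score": 0.97},
        {"Text": "a small pilot study", "Score": 0.85},
        {"Text": "statistically significant results", "Score": 0.99},
    ]
}
print(top_phrases(fake_response))
```

Filtering on the confidence score before ranking keeps low-certainty extractions out of any downstream summary.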
Here are some more specific examples of how LLMs are being used in different scientific fields:
- In biology: LLMs are being used to generate hypotheses about the structure and function of proteins, to identify new drug targets, and to develop personalized treatments for cancer.
- In chemistry: LLMs are being used to design new materials and catalysts, to predict the properties of molecules, and to simulate chemical reactions.
- In physics: LLMs are being used to develop new theories of physics, to analyze data from experiments, and to predict the behavior of physical systems.
- In computer science: LLMs are being used to develop new programming languages, to improve the performance of machine learning algorithms, and to create new forms of user interfaces.
- In linguistics: LLMs are being used to study the structure and evolution of language, to develop new methods for machine translation, and to improve the accuracy of speech recognition systems.
LLMs are still under development, but they have the potential to revolutionize scientific research by helping scientists generate new hypotheses, analyze data, and develop new theories.
In addition to the specific examples above, LLMs can also be used in scientific research to:
- Automate tasks such as literature reviews and data analysis, freeing up scientists’ time to focus on more creative and strategic work.
- Help scientists to communicate their findings to a wider audience, by generating summaries of scientific papers in plain language or by creating new forms of scientific visualization.
- Facilitate collaboration between scientists from different disciplines, by providing a common language for them to communicate with each other.
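The first item, automating parts of a literature review, can be sketched without any LLM at all. The snippet below screens abstracts with a keyword filter (papers and keywords are invented for illustration); in practice an LLM would replace the crude keyword heuristic with a semantic relevance judgment.

```python
def screen_abstracts(abstracts: dict[str, str], keywords: set[str]) -> list[str]:
    """Return titles of papers whose abstracts mention any keyword.
    A keyword filter is a crude stand-in for an LLM relevance judgment."""
    hits = []
    for title, abstract in abstracts.items():
        text = abstract.lower()
        if any(kw.lower() in text for kw in keywords):
            hits.append(title)
    return hits

papers = {
    "Paper A": "We study transformer language models for protein design.",
    "Paper B": "A field survey of alpine plant communities.",
}
print(screen_abstracts(papers, {"language model", "transformer"}))
```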
Overall, LLMs have the potential to make scientific research more efficient, effective, and accessible to everyone.
Concerns about the use of LLMs in science
Here are some of the specific concerns about the use of LLMs in science:
- LLMs may not capture the nuanced value judgments implicit in scientific writing. While LLMs can provide useful general summaries of some scientific texts, they may fail to capture the uncertainties, limitations, and nuances of research that are obvious to human scientists. This could lead to misinterpretations of study results.
- LLMs have been known to generate non-existent and false content. This phenomenon has been dubbed “hallucination.” For example, Meta’s Galactica, an LLM that was initially designed to reason about scientific knowledge, was reported to exhibit significant flaws such as reproducing biases and presenting falsehoods as facts. This means that LLMs should not be relied upon solely for tasks such as writing literature reviews or generating scientific reports.
- The use of LLMs in the peer-review process can endanger trust in it. LLMs used to write peer-review reports risk misinterpreting the submitted scientific article, whether by losing crucial information or by hallucinating content that is not there. And whereas human reviewers can be held responsible for their reports, how to hold LLMs accountable is a nontrivial question, partly owing to their opaque nature.
Who bears the responsibility?
It is important to remember that science is a human enterprise, and LLMs are just tools. Even if the limitations of LLMs could be overcome, it would be a grave error to treat them as scientists who can produce science. Knowledge implies responsibility and is never detached from the scientist who produces it.
As we rush to deploy LLMs into scientific practices, it is important to be aware of the potential risks and to take steps to mitigate them. This includes:
- Carefully evaluating the output of LLMs to ensure that it is accurate and reliable.
- Not relying solely on LLMs for important tasks such as writing scientific papers or generating peer-review reports.
- Developing clear guidelines and best practices for the use of LLMs in science.
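One concrete evaluation step, catching hallucinated references in LLM-generated text, can be partially automated. The sketch below pulls out DOI-like strings with a regular expression; matching the pattern only proves the DOI is well-formed, and each one would still need a lookup against a registry such as doi.org or Crossref to confirm it exists. The example draft text is invented for illustration.

```python
import re

# DOIs start with "10.", a 4-9 digit registrant code, a slash, and a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+\b")

def extract_dois(text: str) -> list[str]:
    """Pull DOI-like strings out of generated text. A match does NOT prove
    the reference is real; each DOI still needs a registry lookup
    (e.g. doi.org or the Crossref API) to confirm it resolves."""
    return DOI_PATTERN.findall(text)

draft = ("See Smith et al., doi:10.1038/s41586-020-2649-2, "
         "and a vaguer claim with no citation at all.")
print(extract_dois(draft))
```

A reviewer workflow could flag both unmatched citations (no DOI to check) and well-formed DOIs that fail the registry lookup.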
Conclusion
LLMs have the potential to revolutionize science in a number of ways. However, it is important to be aware of the potential risks associated with their use and to take steps to mitigate them. By doing so, we can ensure that LLMs are used to benefit science, rather than harm it.