Literature review for prompt engineering of ChatGPT.pptx

Extracting accurate materials
data from research papers with
conversational language models
and prompt engineering

Background
• Growing effort to replace manual extraction with automated
extraction.
• It still requires large effort, expertise and coding.
• ChatExtract can fully automate very accurate data extraction with
minimal initial effort and background.
• Close to 90% of precision.
• Databases for critical cooling rates of metallic glasses and yield
strength are developed.
• Prompt engineering has now become a standard practice in the field
image generation.

The data extraction – two main stages
• 1. Initial classification with simple prompt, clear out all the sentences that
do not contain data.
• 2. A series of prompts that control the data extraction are categorized.
• 2.1. Split data into single and multi valued (single entry are more likely to
be extracted properly)
• 2.2. Include the possibility that some data may be missing from the text.
• 2.3. Use uncertainty-inducing prompts to let model re-analyze the text.
• 2.4. Embed all the questions in a single conversation.
• 2.5. Enforce a strict Yes or No to reduce uncertainty.

Single or multiple values
• Texts with only a single value are much simpler.
• Texts with multiple values need a careful analysis of the relations
between words to determine the correspondance.
• Material properties: Material, Value, Unit

• Then the text is analyzed.
• For a single-valued text, can directly ask questions about the data and ask for separations.
• If nagative answer is given, the text is discarded and no data is extracted.
• For multi-valued sentence, asking the model to provide structured data in a form of a
table.
• Repetitively provide the text with each prompt.
• The repetition helps in maintaining all the details about the text that is being analyzed.
• Enforcing Yes or No will enable the automation of the data extraction process.

1. Blue boxes represent prompts given to the model.
2. Gray boxes are instructions to the user, ‘Yes’, ‘No’, ‘None’.
3. The bold text in ‘[ ]’ are to be replaced with appropriate
values of the named item: material, value or unit.

• Investigated the performance of ChatExtract approach on multiple
property examples: bulk modulus, metallic glass critical cooling rate,
high entropy alloy yield stress
• The reason of choosing bulk modulus: very often report other elastic
properties, such as Young’s modulus or shear modulus with similar
name and range of values.
• Other measurements have similar units as bulk modulus.

1. Single-value sentences have higher precisions and recalls
than multi-valued sentences for the same models.
2. Two core features of ChatGPT that can be in ChatExtract
2.1. Use of redundant prompts that introduce the possibility of
uncertainty about the previous extracted data.
2.2. The information about previous prompts and answers is
kept. This will allow the follow-up questions to relate to the
entire conversation.
3. Removing the follow-up questions will decrease the overall
precision to just 42.7% and 26.5% for ChatGPT-4 and ChatGPT-
3.5, from values of 90.8% and 70.1%.
4. Other models: LLaMA2-chat, ChemDataExtractor2(CDE2)
have been tested as well.

Application to tables and figures
• Data can be found in tables and figures.
• Analyzing figures is still an ongoing challenge.
• For table, the same method as in the general
ChatExtract workflow.
• For figure, only the figure caption is used in the
classification. If positive, figure can be downloaded
for later manual data extraction.
• The precision is very high for extracting tables (98%)
• Assessment of accuracy for figure classification is
more difficult
• For example, 436 figures, manually classified 45
containing bulk modulus data. (80% precision)
• Some figures requires expertise to extract data

Results of real-life data extraction
• Critical cooling rates of metallic glasses
• Databases presented in raw, cleaned, and standardized forms via manual post-processing and
machine learning.
• Despite challenges, ChatExtract demonstrated reasonable precision and recall in data extraction.
• Final standardized database contained 557 datapoints, with duplicates retained.
• A separate database focusing on metallic materials generated, containing 298 unique datapoints.
• ChatExtract showed efficiency compared to manual methods, with consistent performance.

Results of real-life data extraction
• Yield strength of high entropy alloys
• Extracted 10269 raw data points, resulting in 8900 cleaned datapoints.
• Data ranged from 12 MPa to 19.16 GPa, with a peak around 400 MPa.
• Extracted 2456 datapoints from tables and classified 1848 figures as relevant.
• ChatExtract's general and transferable approach proves effective for diverse data extraction tasks.
• Proposed expansions to ChatExtract workflow to handle additional constraints or data types.
• Assessing accuracy of expanded ChatExtract functionalities would require further refinement and
testing.

Conclusions
• ChatGPT extracts high-quality materials data from research texts with engineered prompts.
• Achieved over 90% precision and 87.7% recall on bulk modulus data, and 91.6% precision and
83.6% recall on critical cooling rates.
• Success attributed to purposeful redundancy, uncertainty, and information retention within
conversation.
• Developed two databases: critical cooling rates for metallic glasses and yield strengths for high
entropy alloys.
• Quality of data and simplicity suggest potential to replace labor-intensive methods.
• ChatExtract's independence suggests potential for improvement with newer LLMs.

Literature review for prompt engineering of ChatGPT.pptx

Recommended

Recommended

More Related Content

Similar to Literature review for prompt engineering of ChatGPT.pptx

Similar to Literature review for prompt engineering of ChatGPT.pptx (20)

Recently uploaded

Recently uploaded (20)

Literature review for prompt engineering of ChatGPT.pptx