Cross-Cutting Improvements: Enhancing Findability and Accessibility of Scientific Data with Large Language Models and Vector Indexing

US NSF grant open #nsf-2531903

Summary

Scientific research across domains is increasingly driven by large-scale, complex datasets produced by simulations, experiments, and observations. While the volume and variety of data continue to grow, researchers still face major challenges in discovering, understanding, and accessing the data most relevant to their work. This project aims to establish a new data curation infrastructure by integrating Large Language Models (LLMs) and vector indexing into the data life cycle: improving metadata, enhancing discoverability, and optimizing data storage. Existing data management systems primar
