Generative AI-Mediated Knowledge Management Tool

From Salish Sea Wiki



Using Generative Artificial Intelligence to Increase Access to Place-Based Knowledge

Project Initiation and Completion Dates: July 1, 2025 to June 30, 2027

Applicant: Deschutes Estuary Restoration Team

Category 2 - Link to RFP - Link to Budget

ABSTRACT - This project applies Generative Artificial Intelligence (GenAI) and Retrieval Augmented Generation (RAG) to synthesize and evaluate scientific evidence and policy in support of ecosystem recovery, using the Deschutes River Watershed at the southern terminus of Puget Sound as a case study. Overwhelmed by extensive documentation, local agencies struggle to integrate and utilize ecological knowledge effectively. This project aims to develop a digital archive on the Salish Sea Restoration Platform, where a RAG/LLM (large language model) interface will allow any user to analyze thousands of pages efficiently, improving decision-making and knowledge accessibility. We will aggregate and categorize watershed documents, empower users with AI-driven analysis tools, train technical experts, refine the interface, broaden community engagement, and share findings regionally. The approach leverages existing infrastructure to create an AI-enhanced knowledge repository, facilitating rapid synthesis and interdisciplinary insights. A technical advisory group and public workshops will ensure iterative improvement and accessibility. This initiative challenges the assumption that ecosystem stewardship is limited by the existence of knowledge, proposing instead that knowledge distribution and application are key barriers. Special attention is given to communities that currently struggle with information processing, exploring the use of AI tools to evaluate compliance with policy requirements and planning goals. Ultimately, the project seeks to transform how stakeholders interact with environmental knowledge, fostering more informed decision-making and increasing community participation.

Background and problem statement

This project aims to apply emerging generative artificial intelligence technologies (GenAI) to the synthesis and evaluation of scientific evidence and policy to support ecosystem recovery and enhance social processes. We will develop these new capabilities through a case study in the Deschutes River Watershed. These methods can then be replicated throughout other Puget Sound river basins. The Deschutes River runs through the state capital of Olympia, and for the past 20 years, local teams have produced assessments, monitoring reports, action plans, and project summaries considering habitat, stream flow, water quality, pollution, forestry, climate change, shorelines, endangered species, and stormwater.

This mountain of paper easily exceeds 10,000 pages in our one little watershed. No individual can even report on the scope of this work. We keep generating and reformulating new documents, hoping this leads to wise action. While our expert knowledge is buried in documents, very few people have time to read them all. We have overwhelmed ourselves with information. Large Language Models (LLMs) are a kind of GenAI that can be instructed to generate text mimicking natural writing in response to a prompt. Currently, the text they generate is based on content drawn from across the entire internet. When applied to ecological knowledge, the resulting text is imprecise and often inaccurate.

Retrieval Augmented Generation (RAG) uses a constrained “source of knowledge” to guide a Large Language Model’s outputs. This approach leverages the language production of a full LLM, but directs it to report on a smaller body of identified knowledge. Imagine a shared intern who can read 10,000 pages in five minutes, remembers everything they have read, and can summarize, report, and cite with startling accuracy, while responding to transparent prompts that shape their motives and structure their outputs. RAG allows this “super intern” to be available 24 hours a day, bringing the cost of document review and synthesis to near zero.
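The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the Platform's implementation: it scores passages by simple word overlap (cosine similarity over word counts) where a production system would more likely use learned embeddings, and the passages, source names, and function names are all invented placeholders.

```python
import math
import re
from collections import Counter

# Placeholder passages (invented for illustration, not real archive content).
PASSAGES = [
    {"source": "Placeholder report A",
     "text": "Summer stream temperature was monitored at several stations."},
    {"source": "Placeholder plan B",
     "text": "Riparian planting projects are listed along the river."},
    {"source": "Placeholder memo C",
     "text": "Stormwater retrofits were prioritized in urban basins."},
]

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=2):
    """Return up to k passages that share vocabulary with the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p["text"]))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]

def build_prompt(question, hits):
    """Assemble a prompt that confines the model to the retrieved sources."""
    context = "\n".join(
        f"[{i}] {h['source']}: {h['text']}" for i, h in enumerate(hits, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them by number.\n"
        "If the sources do not answer the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Calling `build_prompt(question, retrieve(question, PASSAGES))` yields the text that would be sent to the LLM. The instruction to cite by number and to admit when the sources are silent is what ties the generated answer back to the identified body of knowledge.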

Unfortunately, our reliable sources of knowledge are disorganized and scattered on hard drives and agency servers, or locked behind the paywalls of scientific publications. The Salish Sea Restoration Platform is the only public-facing open knowledge archive that can support unmediated interagency document management and interrogation. It has been developed through a state and federal partnership with the Society for Ecological Restoration to support all conservation professionals across institutions. Registered users can upload and categorize public documents from any source. Through this project we will add a RAG/LLM interface to the Platform and test that interface by assembling all known reliable knowledge sources within a single watershed (see attachment), as well as governing legal authorities.

The potential effects of a flood of plausible misinformation make the management of knowledge critical: we are already seeing deliberate AI-supported misinformation in public comment processes (NOAA, pers. comm.). Most people do not realize that public agencies responsible for protecting ecosystems are in a de facto arms race in the management of knowledge. The genie is out of the bottle. The other edge of this knife is the potential to rapidly increase expertise in ecosystem management across a broad community of practice by organizing evidence. Without intervention, reliable ecosystem knowledge has become specialized, inaccessible, and expensive to retrieve. GenAI in a RAG/LLM framework can be used to rapidly and accurately synthesize and analyze detailed knowledge from trusted sources.

Objectives and Outcomes

This is a Category 2 proposal, which will conduct an innovative synthesis of scientific products to foster enhanced understanding and analysis. This award will result in a broadly accessible tool that can support future work in any watershed at a low cost. This effort has both predictable and highly unpredictable outcomes and effects, because it fundamentally changes a community's relationship with place-based and ecosystem-based information.

This project will develop a shared archive of place-based information that can be interrogated using a RAG/LLM interface accessible by registered users. We will aggregate and archive all available documents that describe monitoring, assessment, analysis, or planning in the Deschutes Watershed. These resources will be loaded into the Salish Sea Restoration Platform, where they will be categorized and publicly available for review and annotation by participants. We will assemble a volunteer Technical Advisory Group of representatives from institutions engaged in local restoration, protection, and monitoring. This Technical Advisory Group will be supported in interrogating assembled documents using the RAG/LLM interface, combined with manual fact checking. This experience will inform the improvement of the interface, with a focus on developing tools to support high-quality “prompt engineering” (methods for using an LLM to generate reliable text). We will then provide a series of public engagement and empowerment events. We will document the strengths, weaknesses, opportunities, and threats of the tool and share them regionally.

In summary, we aim to achieve the following objectives and outcomes:

  • Aggregate place-based documentation into a searchable archive–we will build a collection of documents describing the Deschutes Watershed on the Salish Sea Platform, using a backdoor mass-upload capability that can turn an annotated bibliography and a stack of PDFs into a set of wiki pages that can then be annotated.
  • Empower rapid analysis of that archive using GenAI–we will develop a web-based RAG/LLM interface available to registered users on the Salish Sea Platform. This will include tools that support prompt engineering for scientific analysis of ecological topics.
  • Train technical experts to interrogate place-based knowledge–we will generate a watershed-based cadre of individuals with the ability to interrogate all written knowledge in the watershed, and publish their findings.
  • Improve the interface–based on iterative work by experts we will improve the interface to maximize the quality of LLM outputs.
  • Broaden community expertise in ecosystem management–we will host community workshops that allow any member of the community to interrogate available evidence.
  • Share with all watersheds–we will generate a report describing our methods and findings, and make the technology available to all watersheds in the Salish Sea.
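The mass-upload step in the first objective, turning an annotated bibliography into wiki pages, might look roughly like the sketch below. It assumes the bibliography arrives as CSV with `title`, `author`, `year`, `pdf`, and `summary` columns, and it renders each row as wikitext using a `Document` template; the column names, the template name, and the category are all hypothetical, and the Platform's actual loader and QA/QC protocol are not described here.

```python
import csv
import io

def rows_to_pages(csv_text, category):
    """Map each bibliography row to (page title -> wikitext body).

    Column names and the "Document" template are hypothetical.
    """
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = row["title"].strip()
        if not title:
            continue  # skip rows that have no page title
        pages[title] = (
            "{{Document\n"
            f"|author={row['author'].strip()}\n"
            f"|year={row['year'].strip()}\n"
            f"|file={row['pdf'].strip()}\n"
            "}}\n"
            f"{row['summary'].strip()}\n\n"
            f"[[Category:{category}]]\n"
        )
    return pages

# Placeholder input (invented for illustration only).
SAMPLE = (
    "title,author,year,pdf,summary\n"
    "Placeholder report,Example Author,2000,placeholder.pdf,One-line annotation.\n"
    ",Nobody,1999,x.pdf,Row without a title is skipped.\n"
)
```

Each resulting page body could then be reviewed by a person before upload, which is one place a QA/QC check would naturally sit.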

We expect and hope that the outcomes of this work will be unpredictable, because the existing management of ecosystem knowledge is limited to rare experts at high cost, and this project will decrease the cost of accurate and synoptic knowledge retrieval. Even these experts lack a synoptic understanding of the authorities, plans, observations, and risks we face as described in a large body of analytical work. It can take years for a new employee to become conversant in the legal basis and written record of the conservation industry.

RAG/LLM is among the most stable and robust applications of GenAI. Confining LLM generation to a specific body of work can be used to complete a well-cited synthesis, as well as gap analysis of available evidence and consideration of the relative importance of different factors and issues. Under a GenAI-augmented system of knowledge management using a shared archive, ignorance is no longer a plausible position for someone involved in ecosystem management decision-making. The average citizen can rapidly develop a robust line of inquiry. Any public official can very rapidly develop a relatively well-informed opinion. We then depend on developing methods of sound and transparent thinking that match our new ability to access information.

Approach

Our technical approach is simple and direct, leveraging off-the-shelf and existing resources. We build on a decade of development on the Salish Sea Restoration Platform. The Platform uses the same software (MediaWiki) that runs Wikipedia. We will develop a new component of the Platform that pre-digests PDF text into a vector database. A Retrieval Augmented Generation (RAG) utility will then deliver semantically related text retrieved from this database to an existing LLM (e.g. ChatGPT 4 plus) to generate outputs based on engineered prompts. To build a knowledge base for the Deschutes, we will work with Lead Entity/watershed resource inventory area (WRIA) coordination bodies, the Local Integrating Organization (South Sound Alliance for Ecosystem Health), and agency program staff, supplemented by GenAI-driven web crawlers, to assemble a single shared archive of Deschutes Watershed knowledge. The Salish Sea Platform supports the ability to automatically load an annotated bibliography and associated PDF documents into the archive using a QA/QC protocol. These documents can then be selected through queries, and interrogated or compared in groups, or consulted en masse by an LLM.
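One part of "pre-digesting" PDF text is splitting each document into overlapping chunks that keep a pointer back to their source, so that retrieved text can be cited. The sketch below assumes the text has already been extracted from the PDF; the `Chunk` fields, the function name, and the default sizes are illustrative assumptions rather than the Platform's design, and the embedding and database steps are left out.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_title: str  # wiki page title of the source document (assumed field)
    position: int   # order of this chunk within the document
    text: str       # the words in this window, joined by spaces

def chunk_words(doc_title, text, size=200, overlap=40):
    """Split extracted text into overlapping windows of `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for position, start in enumerate(range(0, len(words), step)):
        window = words[start:start + size]
        chunks.append(Chunk(doc_title, position, " ".join(window)))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice. Each chunk would then be embedded and stored alongside `doc_title` and `position` so that answers can point back to a specific document.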

Prompt engineering is the critical skill for using an LLM to produce synoptic and analytical text. The RAG/LLM interface design will provide users with prompt resources that increase the value of generated text, training the GenAI in norms of scientific thinking and argument. LLM tools are not broadly used in ecosystem management except by a few early adopters. We will train a cadre of technical experts within the Deschutes Watershed to use GenAI to analyze a shared document archive. Participants will have the opportunity to privately experiment between group sessions, and will be challenged to generate original research using the LLM. We will then regroup and evaluate outputs, findings, and theories. We will develop and document strategies for prompt chaining, decomposition, self-critique, contextualization, framework use, and reverse engineering to provoke useful LLM analysis.
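Three of the strategies named above (decomposition, self-critique, and prompt chaining) can be combined in a short pipeline. The sketch below is one possible arrangement, not a documented method of this project: `llm` stands for any function that takes a prompt string and returns generated text, and the prompt wording and function names are assumptions made for illustration.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # any function mapping a prompt to generated text

def decompose(question: str, llm: LLM) -> List[str]:
    """Decomposition: ask the model for sub-questions, one per line."""
    reply = llm(f"List the sub-questions needed to answer, one per line:\n{question}")
    return [line.strip("-• ").strip() for line in reply.splitlines() if line.strip()]

def answer_with_critique(question: str, context: str, llm: LLM) -> str:
    """Self-critique: draft an answer, check it against the context, revise."""
    draft = llm(f"Using only this context:\n{context}\n\nAnswer: {question}")
    critique = llm(
        "List claims in this draft that the context does not support:\n"
        f"{draft}\n\nContext:\n{context}"
    )
    return llm(
        "Revise the draft to address the critique.\n\n"
        f"Draft:\n{draft}\n\nCritique:\n{critique}"
    )

def chain(question: str, context: str, llm: LLM) -> str:
    """Prompt chaining: answer each sub-question, then combine the results."""
    parts = [answer_with_critique(q, context, llm) for q in decompose(question, llm)]
    return llm("Combine these partial answers into one synthesis:\n" + "\n".join(parts))
```

Because `llm` is passed in as a plain function, the same chain can be run against different models, or against a stub during workshops, and every intermediate prompt remains visible for fact checking.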

Our technical advisory group will be designed to support interdisciplinary thinking and fact checking, with the aim of beginning to consider integrated watershed planning. Existing analyses are usually conducted in disciplinary silos. This platform will allow rapid assessment of interactions among disciplines. Platform development will be led by NOAA, the Salish Sea Restoration Platform working group, and their contractor WikiWorks. Technical and public engagement events will be led by the Deschutes Estuary Restoration Team and their contractors.

Engagement plan

This project is designed to empower both technical experts and the public by improving access to and application of ecosystem knowledge. We will engage two key groups: (1) technical knowledge holders, and (2) policymakers and the communities they serve. Our approach ensures that the project is shaped by those who will use it, increasing its impact on decision-making and ecosystem stewardship. We will use the Salish Sea Restoration Platform as the foundation for engagement, integrating GenAI tools to facilitate knowledge-sharing. The project will involve:

Technical Advisory Group (TAG): This opt-in forum will include representatives from local, state, federal, tribal, and NGO organizations involved in ecosystem management. TAG members will:

  • Identify key documents to include in the knowledge archive.
  • Test the GenAI interface to evaluate its ability to retrieve and synthesize relevant knowledge.
  • Provide feedback on strengths, weaknesses, and areas for improvement.

TAG will be recruited through the Local Integrating Organization (LIO) network, ensuring broad representation across stakeholders.

Public Engagement Workshops: These events will introduce the GenAI tool to policymakers, local officials, and the public. Participants will receive basic instruction in prompt engineering and will have opportunities to explore how AI-mediated knowledge retrieval can enhance decision-making. The workshops aim to:

  • Bridge the gap between technical expertise and public understanding.
  • Improve communication between policymakers and the communities they serve.
  • Encourage new ways of applying ecosystem knowledge to real-world issues.

Equity and Inclusion

The Squaxin Island Tribe, Nisqually Tribe, and Chehalis Tribe play a critical role in monitoring how government and private actions affect treaty-protected resources. This tool will be available to tribal governments and members, allowing for rapid analysis of new proposals against existing ecosystem management plans and legal frameworks. By increasing efficiency and effectiveness, it will help tribes protect and restore treaty rights. Tribal representatives will be invited to join the Technical Community Review.

Recognizing that tribal governments often act as watchdogs in environmental oversight, this project will ensure that GenAI tools support their efforts to review proposals, raise concerns, and link those concerns to legal requirements and environmental risks.

Addressing Key Challenges

This project assumes that knowledge is a limiting factor in ecosystem stewardship. However, other challenges include the uneven distribution of knowledge and cultural norms that prioritize ecosystem exploitation over conservation. We are designing the tool to address these issues by:

  • Expanding access to expert knowledge for all stakeholders.
  • Reducing information silos between agencies, governments, and the public.
  • Empowering local decision-makers with precise, relevant information.

Project Outcomes

All engagement efforts will contribute to the final deployment of an open-source, online resource supporting ecosystem management in the Deschutes Watershed. Findings from this project will also provide a framework for expanding the tool to other watersheds. A final fact sheet summarizing results will be widely shared to inform future applications of GenAI in environmental stewardship.