Run Graph RAG Recipe

Clicking the Run button in the Graph RAG recipe opens a pop-up, as shown below.

The Batch Run feature runs the recipe in one of two modes: Dry Run or Full Run. Users can test a recipe on a small sample before processing the whole dataset.

Dry Run Mode: The Dry Run mode executes the recipe on a minimal number of documents from the dataset. This mode is ideal for testing and validating the recipe's logic and functionality without committing to full-scale execution.
Full Run Mode: In the Full Run mode, the recipe is executed on all the documents in the dataset, carrying out the full processing as defined in the recipe.
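The difference between the two modes can be sketched as a document selection step. This is only an illustration, not the product's implementation: the function name `select_documents`, the mode strings, and the sample size of 3 are all assumptions, since the text above says only that a Dry Run uses "a minimal number of documents".

```python
from typing import List, Sequence

# Hypothetical sample size; the documentation does not state the actual number.
DRY_RUN_SAMPLE_SIZE = 3


def select_documents(documents: Sequence[str], mode: str) -> List[str]:
    """Return the documents a run would process for the given mode."""
    if mode == "dry":
        # Dry Run: a small sample, enough to validate the recipe logic.
        return list(documents[:DRY_RUN_SAMPLE_SIZE])
    if mode == "full":
        # Full Run: every document in the dataset.
        return list(documents)
    raise ValueError(f"unknown run mode: {mode!r}")


docs = [f"doc-{i}.pdf" for i in range(10)]
print(select_documents(docs, "dry"))
print(len(select_documents(docs, "full")))
```

Running the dry mode first surfaces configuration errors on a handful of documents before the cost of a full run is incurred.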

Execution of Graph RAG Recipe with Neo4j

After the Graph RAG recipe has run, an Export Cypher Command button becomes visible, as shown in the image. It lets users retrieve the generated Cypher commands and execute them directly in the Neo4j user interface.

These Cypher commands create two primary types of indexes to enhance search capabilities within the graph database:

Full-Text Indexes: These indexes support keyword-based searches across textual properties, improving the speed and precision of text queries in the graph database.
Vector Indexes: Vector indexes are configured with a specific number of dimensions and a similarity function. They allow similarity comparisons between vectorized data representations (embeddings), which enables semantic search.

Together, Full-Text and Vector Indexes cover both traditional keyword queries and embedding-based semantic queries in Neo4j.
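To give a sense of what such index commands can look like, the sketch below builds two statements in the style of Neo4j 5 index syntax. The commands the recipe actually exports may differ. The helper names, the index names, the `Chunk` label, the `text` and `embedding` properties, and the 1536 dimensions are all hypothetical and chosen only for illustration.

```python
from typing import List


def fulltext_index_cypher(name: str, label: str, properties: List[str]) -> str:
    """Build a CREATE FULLTEXT INDEX statement over the given node properties."""
    props = ", ".join(f"n.{p}" for p in properties)
    return (
        f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON EACH [{props}]"
    )


def vector_index_cypher(
    name: str, label: str, prop: str, dimensions: int, similarity: str = "cosine"
) -> str:
    """Build a CREATE VECTOR INDEX statement with dimensions and similarity."""
    return (
        f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {dimensions}, "
        f"`vector.similarity_function`: '{similarity}'"
        "}}"
    )


# Hypothetical names; substitute the labels and properties of your own graph.
print(fulltext_index_cypher("chunk_text", "Chunk", ["text"]))
print(vector_index_cypher("chunk_embedding", "Chunk", "embedding", 1536))
```

The dimension count must match the size of the embeddings stored on the nodes, and the similarity function (for example cosine) determines how nearest neighbours are ranked.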

When the Graph RAG recipe is executed using Neo4j, the detailed processing status is displayed in real time in the right panel of the recipe interface. This provides clear visibility into each stage of the pipeline, allowing users to monitor progress and troubleshoot effectively. The status includes the following key details for each task:

Task: The specific action being performed (e.g., List objects, Process objects, Knowledge graph processing).
Start Time: The exact time when each task begins execution, helping users track the overall timeline.
Duration: The total time taken for each task to complete, providing insights into the efficiency of the recipe.
State: Indicates the current status of the task (e.g., running, completed, or failed), offering clarity on task progress.
Success Count: The number of successful instances or operations completed during the task, helping users assess progress.
Failure Count: The number of failed instances or operations during task execution, enabling quick identification of issues.
Status Message: Offers detailed messages about the task’s outcome, including success confirmation or error descriptions.
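The status fields listed above map naturally onto a small record type. The sketch below is a hypothetical model, not an export format or API of the product: the class name, field types, state strings, and sample values are assumptions used to show how the fields can be read together when troubleshooting.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskStatus:
    """One row of the status panel (field names mirror the list above)."""
    task: str
    start_time: str          # kept as text; the panel's time format is not specified
    duration_seconds: float
    state: str               # assumed values: "running", "completed", "failed"
    success_count: int
    failure_count: int
    status_message: str


def tasks_needing_attention(statuses: List[TaskStatus]) -> List[str]:
    """Return names of tasks that failed or recorded at least one failure."""
    return [
        s.task for s in statuses
        if s.state == "failed" or s.failure_count > 0
    ]


# Invented sample data for illustration only.
run = [
    TaskStatus("List objects", "10:00:00", 2.0, "completed", 12, 0, "ok"),
    TaskStatus("Process objects", "10:00:02", 30.5, "completed", 11, 1, "1 object skipped"),
    TaskStatus("Knowledge graph processing", "10:00:33", 0.0, "running", 0, 0, ""),
]
print(tasks_needing_attention(run))
```

A task can finish in the completed state and still report a non-zero Failure Count, so checking both fields catches partial failures.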

The key tasks executed during the recipe run include:

List objects: This step identifies and retrieves all input data objects from the specified data source, ensuring the necessary data is available for further processing.
Process objects: Parses and prepares data objects for downstream processing and analysis.
Extract metadata & chunk: Relevant metadata is extracted, and the content is divided into manageable chunks to facilitate efficient handling and analysis of large datasets.
Knowledge graph processing: Entities are analyzed and linked to construct a structured knowledge graph, which is stored in Neo4j for efficient querying and exploration.

With these details shown for each task, users can follow the execution flow, quickly identify issues or failures, and confirm that every step completes successfully, from data extraction to knowledge graph creation in Neo4j.

The image below displays the status shown in the recipe.

Execution of Graph RAG Recipe with Neptune

When the Graph RAG recipe is executed using Neptune, the processing status is displayed in real time in the right panel of the recipe interface. This provides visibility into each step of the pipeline, allowing users to monitor progress and troubleshoot issues that arise during execution. The status includes the following key details for each task:

Task: The specific action being performed (e.g., List objects, Process objects, Metadata extraction).
Start Time: The exact time when each task begins execution, providing a clear timeline of the overall workflow.
Duration: The total time taken for each task to complete, allowing users to assess the efficiency of the recipe.
State: Indicates the current status of each task (e.g., running, completed, failed), helping users understand the overall progress.
Success Count: The number of successful instances or operations performed during the task, helping users track progress.
Failure Count: The number of failed instances or operations, assisting users in identifying any issues that may have occurred.
Status Message: Provides detailed messages for each task, explaining the outcome (whether successful or failed) and offering insights into any errors or retries.

The key tasks executed during the recipe run include:

List objects: Identifies and retrieves all input data objects from the specified data source.
Process objects: Parses and prepares data objects for downstream processing and analysis.
Extract metadata & chunk: Extracts metadata and splits content into smaller, structured chunks for efficient handling.
Metadata extraction: Derives key metadata fields from content to enable categorization and search.
Metadata summarization: Generates concise summaries of metadata to support rapid content understanding.
Knowledge graph processing: Analyzes and structures entities and relationships to build a knowledge graph.
Loading knowledge graph to Neptune: Uploads the constructed knowledge graph into Neptune for graph-based querying.
Creating OpenSearch indices: Creates searchable indices in OpenSearch to enable fast and accurate information retrieval.
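The Neptune task list extends the one shown for Neo4j. As a quick illustration, the sketch below encodes both lists exactly as they appear on this page and computes which tasks are listed only for Neptune. The variable names are hypothetical, and the lists reflect this page only, not an authoritative pipeline definition.

```python
# Task names copied from the Neo4j and Neptune sections of this page.
NEO4J_TASKS = [
    "List objects",
    "Process objects",
    "Extract metadata & chunk",
    "Knowledge graph processing",
]

NEPTUNE_TASKS = [
    "List objects",
    "Process objects",
    "Extract metadata & chunk",
    "Metadata extraction",
    "Metadata summarization",
    "Knowledge graph processing",
    "Loading knowledge graph to Neptune",
    "Creating OpenSearch indices",
]

# Preserve pipeline order while filtering out tasks shared with Neo4j.
neptune_only = [task for task in NEPTUNE_TASKS if task not in NEO4J_TASKS]

for task in neptune_only:
    print(task)
```

The Neptune-only tasks are the ones to look for in the status panel when a run behaves differently from an equivalent Neo4j run.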

By displaying these details for each task, users can monitor execution flow, quickly identify issues or failures, and ensure that each step in the pipeline completes successfully.

Processing Metrics on the Recipe Dashboard

The following processing metrics are displayed on the recipe dashboard, providing an overview of the tasks in the data ingestion pipeline.

X-axis: Processing Tasks

OCR (Optical Character Recognition): Extraction of text from images or scanned documents.
PII (Personally Identifiable Information): Identification and handling of sensitive personal data such as names, addresses, and social security numbers.
Metadata Extraction Chunking: A combined task where relevant metadata is extracted, and content is divided into smaller, structured chunks for efficient analysis and processing.

Y-axis: Count of Processed Items

If errors occur during the recipe run, error messages are displayed in the recipe panel and can also be visualized as error counts in the dashboard.
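The dashboard chart amounts to a count of items per processing task. A minimal sketch of that aggregation follows, assuming each processed item yields a hypothetical `(task, succeeded)` event. The event shape and the choice to count failures separately from processed items are assumptions, not documented behavior.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_by_task(
    events: Iterable[Tuple[str, bool]]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Aggregate events into per-task processed counts and error counts."""
    processed: Counter = Counter()
    errors: Counter = Counter()
    for task, succeeded in events:
        if succeeded:
            processed[task] += 1   # contributes to the Y-axis count
        else:
            errors[task] += 1      # contributes to the error count
    return dict(processed), dict(errors)


# Invented events for illustration; task names follow the X-axis labels above.
events = [
    ("OCR", True),
    ("OCR", True),
    ("PII", True),
    ("Metadata Extraction Chunking", True),
    ("Metadata Extraction Chunking", False),
]
processed, errors = count_by_task(events)
print(processed)
print(errors)
```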

View Run

To review the details of a specific recipe execution, follow the steps outlined below:

Navigate to the Recipe: Access the recipe for which you wish to view the execution details.
Click on the Action Button: Select the Action button associated with the recipe run.
Choose 'View Run': From the dropdown options, select View Run to access the detailed execution status.
Select the Specific Run: Choose the desired run from the list of available executions.
Review the Execution Summary: Once the run is selected, review the summary, which includes detailed information such as connectors, embedded items, and chunks processed during the run.

Refer to the image below, which illustrates the recipe run.


Last updated 4 months ago
