# Intelligent Document Processing

This hands-on lab guides you through building an Intelligent Document Processing (IDP) system that automates document extraction and processing using AI. You'll build a complete workflow that integrates OCR, natural language processing, and webhook triggers.

## Use Case

Build an Intelligent Document Processing system powered by advanced AI services for automated document extraction and analysis.

## Prerequisites

* S3 bucket with sample documents pre-loaded.
* Webhook API access credentials configured.

### Step 1: Build Document Extraction Prompt in Prompt Playground

1. Navigate to the Prompt Playground.
2. Click **Add new** in the top right corner to create a new prompt.
3. Open Prompt templates and select the **Po Extractor** template. This template will create a new summarization prompt.
4. Rename your prompt to **Document Extractor**.
5. **Save** the prompt.

#### Test & Compare

1. In the Test & Compare tab, select different models from the dropdown to test the prompt \[Hint: Compare Claude Sonnet 3.7, Claude Sonnet 3.5 v2, Amazon Nova Pro].
2. Test and compare the prompt responses with the selected models and the guardrail.
3. Select the best-performing model as the **Primary Model**.
4. Optionally assign a **Fallback Model**.
5. Click **Save prompt run** in the right corner to save the prompt run.
6. **Save** and **Publish** the prompt.

### Step 2: Create Notification Agent Prompt

1. Return to Prompt Playground.
2. Click **Add new** in the top corner to add a new prompt.
3. Open Prompt templates and select the **Notify Agent** template.
4. Once the template is selected, increase the **Max State Updates** from **3** to **20** for complex reasoning.
5. In the agent input field, use the following sample input for testing.

```
{
    "po_number": "PO-2025-8975",
    "date": "20250312",
    "delivery_time": "10-14 Business Days",
    "delivery_mode": "Standard Freight",
    "payment_terms": "Net 30 Days",
    "supplier_ref": "SUPP-XYZ-7239",
    "project": "ABC Expansion 4",
    "pool": "General Procurement",
    "requestor": "Employee 64",
    "contact": "+1 555-987-654",
    "originator": "Procurement Team",
    "company": {
        "name": "XYZ ENGINEERING SOLUTIONS LTD.",
        "address": "789 Industrial Way, Houston, TX 77001, USA",
        "phone": "+1 234-567-890",
        "email": "info@xyzengineering.com"
    },
    "delivery_address": "ABC Manufacturing Inc., 321 Production Ave, Phoenix, AZ 85001, USA",
    "invoice_address": "XYZ Engineering Solutions Ltd., 789 Industr",
    "line_items": [
        {"line_no": 1, "item_no": "ITM0001", "description": "Industrial Equipment 861", "quantity": 8, "unit": "PCS", "unit_price": 47.00, "amount": 405.00}
    ],
    "grand_total": 405.00,
    "authorized_by": "",
    "received_by": ""
}

```

6. Name the prompt **Notification Agent** and save the prompt.

<mark style="color:$danger;">\[Hint: Save your changes before navigating away while adding new tools.]</mark>

7. Open your prompt and go to the **Tools** tab to configure the following agent tools:

* **Messaging Tool:**
  * Click **Add new** to create a new tool.
  * Name: "Messaging tool".
  * Description: "Sends email notifications about document processing status".
  * Type: Messaging
  * Messaging type: Email
  * In the **Input Schema** section, use the following email schema and **save** the tool.

    <pre><code>{
      "$schema": "http://json-schema.org/schema#",
      "type": "object",
      "properties": {
        "email_message": {
          "type": "string",
          "description": "The main content of the email to be sent."
        },
        "subject": {
          "type": "string",
          "description": "The subject line of the email."
        },
        "recipient_email": {
          "type": "string",
          "description": "The email address of the recipient.",
          "default":"{{user_context.email}}"
        }
      },
      "required": [
        "email_message",
        "recipient_email",
        "subject"
      ]
    }
    </code></pre>
  * Enter the following credentials:
    * Email
    * Password
    * SMTP Server
    * SMTP Port
    * Recipient email address
  * Provide an appropriate email subject and message body.
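
To make the input schema above more concrete, here is a minimal Python sketch that checks a candidate tool input against it by hand. The `check_email_input` helper and the sample inputs are hypothetical; the sketch only covers the `required` list and the string types, and it is not a description of how the platform itself validates tool inputs.

```python
# Hypothetical helper: a hand-rolled check of the Messaging tool input schema.
# Only the "required" list and the string types from the schema are enforced.
EMAIL_SCHEMA = {
    "type": "object",
    "properties": {
        "email_message": {"type": "string"},
        "subject": {"type": "string"},
        "recipient_email": {"type": "string"},
    },
    "required": ["email_message", "recipient_email", "subject"],
}


def check_email_input(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # First pass: every required field must be present.
    for field in EMAIL_SCHEMA["required"]:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    # Second pass: fields that are present must have the declared type.
    for field, rules in EMAIL_SCHEMA["properties"].items():
        if field in payload and rules["type"] == "string":
            if not isinstance(payload[field], str):
                problems.append(f"field must be a string: {field}")
    return problems


# Illustrative inputs only; the address and text are placeholders.
ok_input = {
    "email_message": "PO-2025-8975 was processed.",
    "subject": "Document processing status",
    "recipient_email": "user@example.com",
}
bad_input = {"subject": 42}

print(check_email_input(ok_input))
print(check_email_input(bad_input))
```

An empty list for the first input means all three required fields are present as strings; the second input reports two missing fields and one type mismatch.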

#### Test & Compare

1. In the Test & Compare tab, select different models from the dropdown to test the prompt.
2. Click the **Test** button and compare the agent responses across the selected models.
3. Click **Select as best answer** to select the best-performing model as the Primary Model.
4. Optionally, use **Select as best answer** to assign a Fallback Model.
5. Click **Save prompt run** in the right corner to save the prompt run.
6. **Save** and **Publish** the prompt.
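
The sample agent input used in this step has to be valid JSON for the test run to be meaningful. As a rough pre-flight check, the Python sketch below parses the text and reports problems before you paste it into the agent input field. The `check_agent_input` helper and its list of required keys are assumptions made for illustration, not platform requirements.

```python
import json

# Hypothetical pre-flight check for the agent's sample input.
# The required keys below are an assumption chosen for illustration.
REQUIRED_KEYS = ("po_number", "line_items", "grand_total")


def check_agent_input(text: str) -> list:
    """Return a list of problems found in the JSON text; empty means OK."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # json.loads rejects trailing commas, a common copy-paste error.
        return [f"invalid JSON at line {exc.lineno}: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value must be an object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]


# Shortened payloads for illustration; the second has a trailing comma.
valid = '{"po_number": "PO-2025-8975", "line_items": [], "grand_total": 405.00}'
trailing_comma = (
    '{"po_number": "PO-2025-8975", "line_items": [{"line_no": 1},], '
    '"grand_total": 405.00}'
)

print(check_agent_input(valid))
print(check_agent_input(trailing_comma))
```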

### Step 3: Create an Agent 2.0 Recipe

1. Navigate to the **Recipes** section.
2. Click **Add New** in the top right corner to create a new recipe.
3. Configure recipe details:
   1. Name: Intelligent Document Processing.
   2. Type: Agent 2.0.
4. Set up the recipe workflow nodes by selecting each of the following elements and dragging them onto the recipe canvas:
   1. Webhook:
      1. From the right-side panel, copy the **Webhook URL** and **Webhook Token** to your notepad for later API use.
      2. Select the **Query method** as POST.
   2. Connector:
      1. Select Amazon S3 as the data source.
      2. Connect the Webhook node to the Connector node.
   3. Processing:
      1. Enable the OCR option and select Amazon Textract.
      2. Enable Extract Layouts and Extract Table.
      3. Connect the Connector node to the Processing node.
   4. Start:
      1. Connect the Processing node to the Start node.
   5. Prompt:
      1. Label: "Document Extractor".
      2. From the prompt dropdown, select the document extraction prompt created earlier.
      3. Scroll down and set the State settings:
         * Document Cache: Enable Document Cache and select the Type as **Ephemeral**.
         * Messages: Enable **Messages**, and select message context as **Last Message**.
      4. Connect the Start node to the Prompt node.
   6. Sink:
      1. Connect the Prompt node to the Sink node to store the JSON output.
      2. Sink node configuration:
         1. S3 bucket path:

            `s3://karini-ai-workshop/idp-sink/`
         2. File name pattern: `po_processing/{filename_prefix}{current_datetime}.json` (for example, `processed/{filename_prefix}{current_datetime}.json`).
   7. Agent:
      1. Label: "Notification Agent".
      2. From the agents dropdown, select the notification agent created earlier.
      3. Scroll down and set the State settings:
         * Document Cache: Enable Document Cache and select the Type as **Ephemeral**.
         * Messages: Enable **Messages**, and select message context as **Last Message**.
         * Enable **metadata**.
      4. Connect the Document Extractor Prompt node to the Notification Agent node.
   8. End:
      1. Connect the Notification Agent node to the End node.
5. On the right side, set the **Number of state updates** to a higher value, such as 75.
6. Save and publish the recipe.
7. By default, the V1 version is deployed once it is published.
8. For subsequent versions, use the **Actions** button and select **Deploy** to deploy the new version. \[Hint: Use the Actions button to open the Deploy dropdown.]
9. Test the recipe once it is published.
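
The node connections described in this step can be written down as a small directed graph, which makes it easy to check that nothing was left unconnected. The Python sketch below is only an illustration: the node names are labels taken from the steps, not platform identifiers, and `reachable_from` is a hypothetical helper.

```python
from collections import deque

# Connections described in the steps above, written as "from -> [to, ...]".
# Node names are labels for illustration, not platform identifiers.
RECIPE_EDGES = {
    "Webhook": ["Connector"],
    "Connector": ["Processing"],
    "Processing": ["Start"],
    "Start": ["Document Extractor"],
    "Document Extractor": ["Sink", "Notification Agent"],
    "Notification Agent": ["End"],
    "Sink": [],
    "End": [],
}


def reachable_from(start: str, edges: dict) -> set:
    """Breadth-first search returning every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


reached = reachable_from("Webhook", RECIPE_EDGES)
unconnected = sorted(set(RECIPE_EDGES) - reached)
print("unconnected nodes:", unconnected)
```

If a connection is missing from the canvas, removing the matching entry from `RECIPE_EDGES` shows which nodes the webhook trigger would no longer reach.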

Refer to the following video to create the recipe.

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F7ZrVuiAUMyuYVvrK5KaB%2Fuploads%2FSfL24YVhfgA2KUGzxRAK%2FIDP%20recipe%20V1.mp4?alt=media&token=a0638e94-a8c8-4e97-ae5a-ab11907d16e5>" %}

### Step 4: Trigger Workflow via API

1. Open a Linux terminal on your computer. Use the following curl command to initiate document processing \[Hint: Ensure the curl command's input payload is valid JSON]. Note the output of the command for the next step.

```
curl -X 'POST' \
  '<Webhook URL from the Webhook Tile>' \
  -H 'accept: application/json' \
  -H 'x-api-token: <YOUR API TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{
    "files": [
        {
            "content_type": "application/pdf",
            "file_content": "",
            "file_path": "<S3 path>"
        }
    ],
    "input_message": "Extract document metadata",
    "metadata": { "email": "<YOUR EMAIL ID>" }
}'

```

2. Retrieve webhook request status by ID:

```
curl -X 'GET' \
  '</api/webhook/request/{request_id}>' \
  -H 'accept: application/json' \
  -H 'x-api-token: <YOUR API TOKEN>'

```

3. View all webhooks for a specific recipe:

```
curl -X 'GET' \
  '</api/webhook/recipe/{recipe_id}>' \
  -H 'accept: application/json' \
  -H 'x-api-token: <YOUR API TOKEN>'

```
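
Because the trigger payload must be valid JSON, one option is to build the request in code and let a JSON serializer handle quoting. The Python sketch below mirrors the first curl command using only the standard library. The `build_trigger_request` helper is hypothetical, and the URL, token, S3 path, and email shown are placeholders, so the request is constructed but not sent.

```python
import json
import urllib.request


def build_trigger_request(webhook_url, api_token, s3_path, email):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "files": [
            {
                "content_type": "application/pdf",
                "file_content": "",
                "file_path": s3_path,
            }
        ],
        "input_message": "Extract document metadata",
        "metadata": {"email": email},
    }
    # json.dumps guarantees the body is valid JSON with quoted strings.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "x-api-token": api_token,
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute the Webhook URL and token copied in Step 3.
request = build_trigger_request(
    "https://example.invalid/webhook-placeholder",
    "PLACEHOLDER_TOKEN",
    "s3://example-bucket/sample.pdf",
    "user@example.com",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it for real: urllib.request.urlopen(request)
```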

Refer to the following video to trigger the workflow.

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F7ZrVuiAUMyuYVvrK5KaB%2Fuploads%2Fjr3OeWh6gKKDqyBB0HPk%2FWebhook%20trigger%20workflow%20test.mp4?alt=media&token=212fe06e-ac4f-4f13-bfbf-bf36338c833a>" %}

### Step 5: Monitor and Verify Results

1. Click the recipe icon on the left-hand toolbar.
2. Click the **Actions** button on the Intelligent Document Processing recipe.
3. Select **Webhook History**. Here, you can review the history of all webhook requests along with their inputs, responses, status, tokens, and detailed traces.
4. Additionally, you can access the S3 location of your Sink node and review JSON output.
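
To know roughly where to look in S3, it can help to expand the Sink node settings into a full object path. The Python sketch below does this under stated assumptions: the placeholders are filled by plain string substitution, and the timestamp format is invented for illustration because the lab does not specify how `{current_datetime}` is rendered. Both helpers and the `sample_po_` prefix are hypothetical.

```python
from datetime import datetime, timezone

# Values taken from the Sink node configuration in Step 3.
SINK_PATH = "s3://karini-ai-workshop/idp-sink/"
FILE_PATTERN = "po_processing/{filename_prefix}{current_datetime}.json"


def expected_output_uri(filename_prefix: str, when: datetime) -> str:
    # Assumption: placeholders are filled by plain string substitution, and
    # the timestamp format below is purely illustrative.
    stamp = when.strftime("%Y%m%d%H%M%S")
    name = FILE_PATTERN.format(
        filename_prefix=filename_prefix, current_datetime=stamp
    )
    return SINK_PATH.rstrip("/") + "/" + name


def split_s3_uri(uri: str) -> tuple:
    """Split 's3://bucket/key' into (bucket, key)."""
    without_scheme = uri[len("s3://"):]
    bucket, _, key = without_scheme.partition("/")
    return bucket, key


uri = expected_output_uri(
    "sample_po_", datetime(2025, 3, 12, 9, 30, 0, tzinfo=timezone.utc)
)
print(uri)
print(split_s3_uri(uri))
```

In practice, browsing the `idp-sink/po_processing/` prefix in the bucket is enough; the exact file names depend on how the platform renders the placeholders.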

This IDP workflow automates document processing from ingestion through extraction to notification, creating a complete intelligent document handling system.

Refer to the following video to review the webhook history.

{% embed url="<https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F7ZrVuiAUMyuYVvrK5KaB%2Fuploads%2FVEbtjS1YqCImDfoQy3vW%2FIDP%20webhook%20history.mp4?alt=media&token=85f00e4c-74eb-493a-aaff-1332927f6e7d>" %}
