Amazon S3 manifest

This page shows sample manifests. Each entry in a manifest references a document stored in Amazon S3 and carries the metadata associated with that document.

  1. Manifest file with source_ref and metadata

[
  {
    "source_ref": "s3://examplebucket/blogs/example.com-How to Optimize AWS Lambda for Scalability.pdf",
    "metadata": {
      "user_name": "Alice Cooper"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-Introduction to Amazon S3 Glacier for Data Archiving.pdf",
    "metadata": {
      "user_name": "Bob Johnson"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-AWS EC2 Instance Types and Choosing the Right One.pdf",
    "metadata": {
      "user_name": "Charlie Daniels"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-Amazon DynamoDB: Best Practices and Common Pitfalls.pdf",
    "metadata": {
      "user_name": "Diana Smith"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-Deep Dive into Amazon RDS Performance Tuning.pdf",
    "metadata": {
      "user_name": "Edward White"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-Amazon ElasticSearch Service: What You Need to Know.pdf",
    "metadata": {
      "user_name": "Fiona Green"
    }
  },
  {
    "source_ref": "s3://examplebucket/blogs/example.com-Optimizing CloudWatch Monitoring for AWS Environments.pdf",
    "metadata": {
      "user_name": "George Harris"
    }
  }
]
  • source_ref: A reference to the file stored in an S3 bucket. In the sample above, each reference is an S3 URI made up of the bucket name followed by the object key.

  • metadata: An object containing information about the entry, such as the name of the user who uploaded the file or is otherwise associated with it.

    • user_name: The name of the user associated with this particular document or data entry.
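Before uploading a manifest, it can help to check that every entry has the shape described above. The following is a minimal sketch, not an official validator: the function name validate_manifest and the specific rules (source_ref must be an s3:// URI with a bucket and key, metadata must be a JSON object) are assumptions drawn from the sample, and the second entry in manifest_text is deliberately malformed for illustration.

```python
import json
from urllib.parse import urlparse


def validate_manifest(entries):
    """Return a list of problems found; an empty list means none were found."""
    if not isinstance(entries, list):
        return ["manifest must be a JSON array of entries"]
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: not a JSON object")
            continue
        ref = entry.get("source_ref")
        parsed = urlparse(ref) if isinstance(ref, str) else None
        # Assumed rule: source_ref looks like s3://bucket/key.
        if (
            parsed is None
            or parsed.scheme != "s3"
            or not parsed.netloc
            or not parsed.path.strip("/")
        ):
            problems.append(f"entry {i}: source_ref is not an s3://bucket/key URI")
        # Assumed rule: metadata, when present, is a JSON object.
        if not isinstance(entry.get("metadata", {}), dict):
            problems.append(f"entry {i}: metadata is not a JSON object")
    return problems


# Hypothetical two-entry manifest; the second source_ref lacks the s3:// scheme.
manifest_text = """
[
  {"source_ref": "s3://examplebucket/blogs/example.pdf",
   "metadata": {"user_name": "Alice Cooper"}},
  {"source_ref": "blogs/missing-scheme.pdf",
   "metadata": {"user_name": "Bob Johnson"}}
]
"""

problems = validate_manifest(json.loads(manifest_text))
for problem in problems:
    print(problem)
```

The check is intentionally loose about the object key, since the sample keys contain spaces and colons.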

  2. Manifest file with source_ref, processed_source_ref and metadata

  • source_ref: The S3 path or URL to the original unprocessed document.

  • processed_source_ref: The S3 path or URL to the processed version of the document (usually in JSON format).

  • metadata: Contains additional information about the document such as the user's name, experience, and the source of the document.

    • name: The name of the person associated with the document.

    • experience: The professional experience of the person, typically in "years_months" format.

    • source: The category or source of the document (e.g., "cybersecurity-analyst-resumes").
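An entry with these fields could be assembled as in the sketch below. Treat everything in it as illustrative: the helper names make_entry and format_experience, both S3 paths, and the person's details are hypothetical, and the reading of "years_months" as two integers joined by an underscore is an assumption rather than a documented rule.

```python
import json


def format_experience(years, months):
    """Format experience as "years_months" (assumed: two integers joined by "_")."""
    if years < 0 or not 0 <= months < 12:
        raise ValueError("years must be >= 0 and months must be in 0..11")
    return f"{years}_{months}"


def make_entry(source_ref, processed_source_ref, name, years, months, source):
    """Build one manifest entry with the fields described above."""
    return {
        "source_ref": source_ref,
        "processed_source_ref": processed_source_ref,
        "metadata": {
            "name": name,
            "experience": format_experience(years, months),
            "source": source,
        },
    }


# Hypothetical paths and values, for illustration only.
entry = make_entry(
    source_ref="s3://examplebucket/resumes/example-resume.pdf",
    processed_source_ref="s3://examplebucket/processed/example-resume.json",
    name="Alice Cooper",
    years=3,
    months=6,
    source="cybersecurity-analyst-resumes",
)

# A manifest is a JSON array of such entries.
print(json.dumps([entry], indent=2))
```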
