> For the complete documentation index, see [llms.txt](https://karini-ai.gitbook.io/karini-ai-documentation/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://karini-ai.gitbook.io/karini-ai-documentation/observability.md).

# Observability

Every request in Karini AI includes a trace that outlines the steps orchestrated by the prompt, agent, recipe, or copilot. The trace lets you follow, step by step, how the response at that point in the conversation was produced. The table below lists each tracing step, its input and output, and the attributes recorded for it.

| Tracing Step | Input and Output | Attributes |
| --- | --- | --- |
| Detect Greeting Questions | Input: Greeting detection prompt with the input question<br>Output: Classification output | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.prompt.0.role`<br>`gen_ai.completion.0.finish_reason`<br>`gen_ai.completion.0.role`<br>`gen_ai.openai.api_base`<br>`gen_ai.openai.system_fingerprint`<br>`gen_ai.request.max_tokens`: The maximum number of response tokens requested<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.request.temperature`<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.completion_tokens`: The number of tokens used for the completion response<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.headers`: The headers used for the request<br>`llm.is_streaming`<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
| Check Content Safety | Input: User query<br>Output: Content safety check output | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name |
| Query Embeddings | Input: User query<br>Output: Vector embeddings of the user query | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.openai.api_base`<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.response.model`: The model actually used (e.g. `gpt-4-0613`, etc.)<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.headers`: The headers used for the request<br>`llm.is_streaming`<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
| Get similar embeddings | Input: User query<br>Output: Similar embeddings from the vector store | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name |
| Perform reranking | Input: User query<br>Output: Similar embeddings reranked with the Cohere reranker | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name |
| Get Qna chain streaming | Input: Prompt, user query, and the reranked context<br>Output: Response from the LLM | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.completion.0.finish_reason`<br>`gen_ai.completion.0.role`<br>`gen_ai.openai.api_base`<br>`gen_ai.openai.api_version`<br>`gen_ai.prompt.0.role`<br>`gen_ai.request.max_tokens`: The maximum number of response tokens requested<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.request.temperature`<br>`gen_ai.response.model`: The model actually used (e.g. `gpt-4-0613`, etc.)<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.completion_tokens`: The number of tokens used for the completion response<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.headers`: The headers used for the request<br>`llm.is_streaming`<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
| Get Followup Questions | Input: Follow-up question generation prompt, user query, and the LLM-generated answer to the user query<br>Output: Follow-up questions | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.completion.0.finish_reason`<br>`gen_ai.completion.0.role`<br>`gen_ai.openai.api_base`<br>`gen_ai.openai.system_fingerprint`<br>`gen_ai.openai.api_version`<br>`gen_ai.prompt.0.role`<br>`gen_ai.request.max_tokens`: The maximum number of response tokens requested<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.request.temperature`<br>`gen_ai.response.model`: The model actually used (e.g. `gpt-4-0613`, etc.)<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.completion_tokens`: The number of tokens used for the completion response<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.headers`: The headers used for the request<br>`llm.is_streaming`<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
| Agent Executor | Input: Prompt, user question, agent thoughts, and actions<br>Output: Response to the agent action | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.prompt.0.role`<br>`gen_ai.request.max_tokens`: The maximum number of response tokens requested<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.request.temperature`<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.completion_tokens`: The number of tokens used for the completion response<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
| Get LLM Chain Streaming | Input: Prompt with the user query<br>Output: Response from the LLM | `ServiceName`: Information about the application in the resource<br>`SpanName`: Internal function name<br>`gen_ai.prompt.0.role`<br>`gen_ai.request.max_tokens`: The maximum number of response tokens requested<br>`gen_ai.request.model`: The model requested (e.g. `gpt-4`, `claude`, etc.)<br>`gen_ai.request.temperature`<br>`gen_ai.system`: The vendor of the LLM (e.g. OpenAI, Anthropic, etc.)<br>`gen_ai.usage.completion_tokens`: The number of tokens used for the completion response<br>`gen_ai.usage.prompt_tokens`: The number of tokens used for the prompt in the request<br>`llm.request.type`: The type of request (e.g. `completion`, `chat`, etc.)<br>`llm.usage.total_tokens`: The total number of tokens used |
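
As an illustration of how the usage attributes in the table above might be consumed, the following minimal sketch sums token counts per tracing step. It assumes the spans have already been exported into plain Python dictionaries. The `step` and `attributes` keys, the `summarize_token_usage` helper, and all sample values are hypothetical and are not part of the Karini AI API. Only the attribute names are taken from the table.

```python
from typing import Any

# Usage attributes named in the table above.
USAGE_KEYS = (
    "gen_ai.usage.prompt_tokens",
    "gen_ai.usage.completion_tokens",
    "llm.usage.total_tokens",
)


def summarize_token_usage(spans: list[dict[str, Any]]) -> dict[str, dict[str, int]]:
    """Sum the token usage attributes per tracing step.

    Steps that carry no usage attributes (for example, reranking) are skipped.
    """
    usage: dict[str, dict[str, int]] = {}
    for span in spans:
        attrs = span.get("attributes", {})
        if not any(key in attrs for key in USAGE_KEYS):
            continue
        totals = usage.setdefault(span["step"], dict.fromkeys(USAGE_KEYS, 0))
        for key in USAGE_KEYS:
            totals[key] += int(attrs.get(key, 0))
    return usage


# Hypothetical sample data: the dict layout and the numbers are assumptions
# made for illustration, not real trace output.
spans = [
    {"step": "Query Embeddings",
     "attributes": {"gen_ai.usage.prompt_tokens": 12,
                    "llm.usage.total_tokens": 12}},
    {"step": "Perform reranking", "attributes": {}},
    {"step": "Get Qna chain streaming",
     "attributes": {"gen_ai.usage.prompt_tokens": 850,
                    "gen_ai.usage.completion_tokens": 120,
                    "llm.usage.total_tokens": 970}},
    {"step": "Get Followup Questions",
     "attributes": {"gen_ai.usage.prompt_tokens": 300,
                    "gen_ai.usage.completion_tokens": 45,
                    "llm.usage.total_tokens": 345}},
]

for step, totals in summarize_token_usage(spans).items():
    print(step, totals)
```

Steps without usage attributes are skipped rather than reported as zero, so the summary only lists steps that actually called a model.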