# LLMs (inference)

> \[!TIP]
>
> Concrete provider implementations live in `Hive-agent-framework/adapters`.
>
> The base abstractions live in `Hive-agent-framework/llms`.

A Large Language Model (LLM) is an AI model designed to understand and generate human-like text. Trained on extensive text data, LLMs learn language patterns, grammar, context, and basic reasoning, which lets them perform tasks such as text completion, translation, summarization, and question answering.

To smooth over the differences between provider APIs, the framework defines a common interface: a shared set of operations (such as `generate` and `stream`) that are called the same way regardless of the provider.
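
As a rough illustration of what a common interface buys you, the sketch below defines a hypothetical `TextModel` interface and two stand-in providers. None of these names come from the framework (its real base classes live in `Hive-agent-framework/llms`); the point is only that calling code depends on the shared interface rather than on a specific provider.

```ts
// Hypothetical sketch: these types are NOT the framework's API.
interface TextOutput {
  getTextContent(): string;
}

interface TextModel {
  readonly providerName: string;
  generate(prompt: string): Promise<TextOutput>;
}

// Stand-in provider that upper-cases the prompt instead of calling a remote API.
class ShoutingModel implements TextModel {
  readonly providerName = "shouting";

  async generate(prompt: string): Promise<TextOutput> {
    const text = prompt.toUpperCase();
    return { getTextContent: () => text };
  }
}

// Stand-in provider that reverses the prompt.
class ReversingModel implements TextModel {
  readonly providerName = "reversing";

  async generate(prompt: string): Promise<TextOutput> {
    const text = prompt.split("").reverse().join("");
    return { getTextContent: () => text };
  }
}

// Calling code depends only on the shared interface, so providers are interchangeable.
async function ask(model: TextModel, prompt: string): Promise<string> {
  const output = await model.generate(prompt);
  return `[${model.providerName}] ${output.getTextContent()}`;
}
```

For example, `await ask(new ShoutingModel(), "hello")` resolves to `[shouting] HELLO`, and swapping in `ReversingModel` requires no change to `ask`.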

## Providers (adapters)

| Name                                                                       | LLM                        | Chat LLM                                      | Structured output (constrained decoding) |
| -------------------------------------------------------------------------- | -------------------------- | --------------------------------------------- | ---------------------------------------- |
| `WatsonX`                                                                  | ✅                          | ⚠️ (model specific template must be provided) | ❌                                        |
| `Ollama`                                                                   | ✅                          | ✅                                             | ⚠️ (JSON only)                           |
| `OpenAI`                                                                   | ❌                          | ✅                                             | ⚠️ (JSON schema only)                    |
| `Azure OpenAI`                                                             | ❌                          | ✅                                             | ⚠️ (JSON schema only)                    |
| `LangChain`                                                                | ⚠️ (depends on a provider) | ⚠️ (depends on a provider)                    | ❌                                        |
| `Groq`                                                                     | ❌                          | ✅                                             | ⚠️ (JSON object only)                    |
| `AWS Bedrock`                                                              | ❌                          | ✅                                             | ⚠️ (JSON only) - model specific          |
| `VertexAI`                                                                 | ✅                          | ✅                                             | ⚠️ (JSON only)                           |
| `BAM (Internal)`                                                           | ✅                          | ⚠️ (model specific template must be provided) | ✅                                        |
| ➕ [Request](https://github.com/i-am-Hive/Hive-agent-framework/discussions) |                            |                                               |                                          |

Examples for all providers can be found in `examples/llms/providers`.

Are you interested in creating your own adapter? Jump to the [adding a new provider](#adding-a-new-provider-adapter) section.

## Usage

### Plain text generation

```ts
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { WatsonXLLM } from "Hive-agent-framework/adapters/watsonx/llm";

const llm = new WatsonXLLM({
  modelId: "google/flan-ul2",
  projectId: process.env.WATSONX_PROJECT_ID,
  apiKey: process.env.WATSONX_API_KEY,
  region: process.env.WATSONX_REGION, // (optional) default is us-south
  parameters: {
    decoding_method: "greedy",
    max_new_tokens: 50,
  },
});

const reader = createConsoleReader();
const prompt = await reader.prompt();
const response = await llm.generate(prompt);
reader.write(`LLM 🤖 (text) : `, response.getTextContent());
reader.close();
```

*Source: examples/llms/text.ts*

> \[!NOTE]
>
> The `generate` method returns an instance of a class that extends the base `BaseLLMOutput` class. Use its `getTextContent` method to retrieve the response as text; the instance also exposes other useful metadata.

> \[!TIP]
>
> To make `generate` use streaming internally, pass `{ stream: true }` as its second parameter.

### Chat text generation

```ts
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "Hive-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "Hive-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();

const reader = createConsoleReader();

for await (const { prompt } of reader) {
  const response = await llm.generate([
    BaseMessage.of({
      role: Role.USER,
      text: prompt,
    }),
  ]);
  reader.write(`LLM 🤖 (txt) : `, response.getTextContent());
  reader.write(`LLM 🤖 (raw) : `, JSON.stringify(response.finalResult));
}
```

*Source: examples/llms/chat.ts*

> \[!NOTE]
>
> The `generate` method returns an instance of a class that extends the base `ChatLLMOutput` class. Use its `getTextContent` method to retrieve the response as text; the instance also exposes other useful metadata. To retrieve all messages (chunks), access the `messages` property (getter).

> \[!TIP]
>
> To make `generate` use streaming internally, pass `{ stream: true }` as its second parameter.

#### Streaming

```ts
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "Hive-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "Hive-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();

const reader = createConsoleReader();

for await (const { prompt } of reader) {
  for await (const chunk of llm.stream([
    BaseMessage.of({
      role: Role.USER,
      text: prompt,
    }),
  ])) {
    reader.write(`LLM 🤖 (txt) : `, chunk.getTextContent());
    reader.write(`LLM 🤖 (raw) : `, JSON.stringify(chunk.finalResult));
  }
}
```

*Source: examples/llms/chatStream.ts*
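
If you need the complete reply as well as the individual chunks, you can concatenate the fragments as they arrive. Here is a minimal sketch of that pattern, assuming each chunk carries only the newly generated fragment. `fakeStream` is a hypothetical stand-in for `llm.stream(...)` so the code runs without a model, and the `Chunk` type mirrors only the `getTextContent` method used above.

```ts
// Stand-in for a streamed chunk; mirrors only the getTextContent() method used above.
interface Chunk {
  getTextContent(): string;
}

// Hypothetical replacement for llm.stream(...): yields a fixed reply piece by piece.
async function* fakeStream(pieces: string[]): AsyncGenerator<Chunk> {
  for (const piece of pieces) {
    yield { getTextContent: () => piece };
  }
}

// Consumes any async stream of chunks, reports each fragment, and returns the full text.
async function collectText(
  stream: AsyncIterable<Chunk>,
  onFragment: (fragment: string) => void = () => {},
): Promise<string> {
  let fullText = "";
  for await (const chunk of stream) {
    const fragment = chunk.getTextContent();
    onFragment(fragment); // e.g. print tokens as they arrive
    fullText += fragment;
  }
  return fullText;
}
```

Under that assumption, passing the result of `llm.stream(...)` in place of `fakeStream(...)` should work unchanged, since the loop relies only on `getTextContent`.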

#### Callback (Emitter)

```ts
import "dotenv/config.js";
import { createConsoleReader } from "examples/helpers/io.js";
import { BaseMessage, Role } from "Hive-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "Hive-agent-framework/adapters/ollama/chat";

const llm = new OllamaChatLLM();

const reader = createConsoleReader();

for await (const { prompt } of reader) {
  const response = await llm
    .generate(
      [
        BaseMessage.of({
          role: Role.USER,
          text: prompt,
        }),
      ],
      {},
    )
    .observe((emitter) =>
      emitter.match("*", (data, event) => {
        reader.write(`LLM 🤖 (event: ${event.name})`, JSON.stringify(data));

        // if you want to close the stream prematurely, just uncomment the following line
        // callbacks.abort()
      }),
    );

  reader.write(`LLM 🤖 (txt) : `, response.getTextContent());
  reader.write(`LLM 🤖 (raw) : `, JSON.stringify(response.finalResult));
}
```

*Source: examples/llms/chatCallback.ts*

### Structured generation

```ts
import "dotenv/config.js";
import { z } from "zod";
import { BaseMessage, Role } from "Hive-agent-framework/llms/primitives/message";
import { OllamaChatLLM } from "Hive-agent-framework/adapters/ollama/chat";
import { JsonDriver } from "Hive-agent-framework/llms/drivers/json";

const llm = new OllamaChatLLM();
const driver = new JsonDriver(llm);
const response = await driver.generate(
  z.union([
    z.object({
      firstName: z.string().min(1),
      lastName: z.string().min(1),
      address: z.string(),
      age: z.number().int().min(1),
      hobby: z.string(),
    }),
    z.object({
      error: z.string(),
    }),
  ]),
  [
    BaseMessage.of({
      role: Role.USER,
      text: "Generate a profile of a citizen of Europe.",
    }),
  ],
);
console.info(response);
```

*Source: examples/llms/structured.ts*
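
Because the schema above is a union, the generated object may be either a profile or an error, and calling code has to tell the two apart. The sketch below shows one way to narrow such a value in plain TypeScript. The `Profile` and `GenerationError` types are hand-written to mirror the schema (with zod they could instead be derived via `z.infer`), and `summarize` is a hypothetical helper. How the parsed object is exposed on `response` is not shown on this page, so the sketch starts from the parsed value itself.

```ts
// Hand-written mirrors of the two schema branches.
interface Profile {
  firstName: string;
  lastName: string;
  address: string;
  age: number;
  hobby: string;
}

interface GenerationError {
  error: string;
}

type Result = Profile | GenerationError;

// Type guard: only the error branch has an `error` property.
function isGenerationError(result: Result): result is GenerationError {
  return "error" in result;
}

// Hypothetical helper that turns either branch into a one-line summary.
function summarize(result: Result): string {
  if (isGenerationError(result)) {
    return `Generation failed: ${result.error}`;
  }
  // From here on TypeScript knows `result` is a Profile.
  return `${result.firstName} ${result.lastName} (${result.age}), likes ${result.hobby}`;
}
```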

## Adding a new provider (adapter)

To use an inference provider that is not in our providers list, feel free to [create a request](https://github.com/i-am-Hive/Hive-agent-framework/discussions).

If the request is approved and you want to implement the adapter yourself, follow the conventions below. Let's assume the name of your provider is `Custom`.

* Base location within the framework: `Hive-agent-framework/adapters/custom`
  * Text LLM (filename): `llm.ts` (example implementation)
  * Chat LLM (filename): `chat.ts` (example implementation)

> \[!IMPORTANT]
>
> If the target provider provides an SDK, use it.

> \[!IMPORTANT]
>
> All provider-related dependencies (if any) must be included in `devDependencies` and `peerDependencies` in the `package.json`.

> \[!TIP]
>
> To simplify work with the target REST API, feel free to use the helper `RestfulClient` class. For an example of its usage, see the WatsonX LLM adapter.

> \[!TIP]
>
> Parse environment variables with the helper functions `parseEnv`, `hasEnv`, and `getEnv`.
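
To make the guidance above more concrete, here is a rough skeleton of what `adapters/custom/llm.ts` could look like. Everything in it is illustrative: the `CustomLLM` class, the `HttpPost` transport type, the endpoint path, and the response shape are assumptions. A real adapter would extend the framework's base LLM class (whose required methods are not listed on this page) rather than stand alone. The transport is injected so the adapter can be exercised without network access.

```ts
// Illustrative skeleton only: not the framework's base class or a real provider API.

// Minimal transport abstraction; a real adapter would use the provider SDK or RestfulClient.
type HttpPost = (path: string, body: unknown) => Promise<unknown>;

interface CustomLLMInput {
  modelId: string;
  apiKey: string;
  post: HttpPost;
}

class CustomLLMOutput {
  private readonly text: string;

  constructor(text: string) {
    this.text = text;
  }

  getTextContent(): string {
    return this.text;
  }
}

// Validates the (assumed) response shape { generated_text: string }.
function parseGeneratedText(raw: unknown): string {
  if (typeof raw === "object" && raw !== null && "generated_text" in raw) {
    const value = (raw as { generated_text: unknown }).generated_text;
    if (typeof value === "string") {
      return value;
    }
  }
  throw new Error("CustomLLM: unexpected response shape");
}

class CustomLLM {
  private readonly input: CustomLLMInput;

  constructor(input: CustomLLMInput) {
    if (!input.apiKey) {
      throw new Error("CustomLLM: apiKey is required");
    }
    this.input = input;
  }

  async generate(prompt: string): Promise<CustomLLMOutput> {
    // Hypothetical endpoint and payload; adjust to the actual provider.
    const raw = await this.input.post("/v1/generate", {
      model: this.input.modelId,
      prompt,
    });
    return new CustomLLMOutput(parseGeneratedText(raw));
  }
}
```

Validating the response before wrapping it keeps provider-specific quirks inside the adapter, so callers only ever see an output object with `getTextContent`.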

