ModelFusion

Build AI applications, chatbots, and agents with JavaScript and TypeScript.



> [!NOTE]
> ModelFusion is in its initial development phase. Until version 1.0 there may be breaking changes, because I am still exploring the API design. Feedback and suggestions are welcome.


Introduction


ModelFusion is a library for building AI apps, chatbots, and agents. It provides abstractions for AI models, vector indices, and tools.

- Type inference and validation: ModelFusion uses TypeScript and Zod to infer types wherever possible and to validate model responses.
- Flexibility and control: AI application development can be complex and unique to each project. With ModelFusion, you have complete control over the prompts and model settings, and you can access the raw responses from the models quickly to build what you need.
- No chains and predefined prompts: Use the concepts provided by JavaScript (variables, functions, etc.) and explicit prompts to build applications you can easily understand and control. Not black magic.
- More than LLMs: ModelFusion supports other models, e.g., text-to-image and voice-to-text, to help you build rich AI applications that go beyond just text.
- Integrated support features: Essential features like logging, retries, throttling, tracing, and error handling are built-in, helping you focus more on building your application.

Quick Install


```sh
npm install modelfusion
```

You need to install zod and a matching version of zod-to-json-schema (peer dependencies):

```sh
npm install zod zod-to-json-schema
```

Usage Examples


You can provide API keys for the different integrations using environment variables (e.g., OPENAI_API_KEY) or pass them into the model constructors as options.
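The two ways of supplying a key can be pictured as a simple lookup. The following is a minimal sketch, not ModelFusion's internal code: `resolveApiKey`, `EnvLike`, and the option shape are hypothetical, and the sketch assumes that an explicitly passed key takes precedence over the environment variable.

```typescript
// Hypothetical helper, not part of ModelFusion: picks an API key from
// explicit options first and falls back to an environment variable.
type EnvLike = Record<string, string | undefined>;

function resolveApiKey(
  options: { apiKey?: string },
  env: EnvLike,
  envVariableName: string
): string {
  // Assumption for this sketch: an explicitly passed key wins.
  const apiKey = options.apiKey ?? env[envVariableName];

  if (apiKey === undefined || apiKey === "") {
    throw new Error(
      `No API key found. Pass 'apiKey' as an option or set ${envVariableName}.`
    );
  }

  return apiKey;
}

const keyFromOptions = resolveApiKey(
  { apiKey: "sk-from-options" },
  { OPENAI_API_KEY: "sk-from-env" },
  "OPENAI_API_KEY"
);

const keyFromEnv = resolveApiKey(
  {},
  { OPENAI_API_KEY: "sk-from-env" },
  "OPENAI_API_KEY"
);
```

In a Node.js program you would typically pass `process.env` as the `env` argument.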


Generate text using a language model and a prompt.
You can stream the text if it is supported by the model.
You can use prompt mappings to change the prompt format of a model.

generateText


```ts
import { generateText, OpenAITextGenerationModel } from "modelfusion";

const text = await generateText(
  new OpenAITextGenerationModel({ model: "text-davinci-003" }),
  "Write a short story about a robot learning to love:\n\n"
);
```

streamText


```ts
import { OpenAIChatMessage, OpenAIChatModel, streamText } from "modelfusion";

const textStream = await streamText(
  new OpenAIChatModel({ model: "gpt-3.5-turbo", maxTokens: 1000 }),
  [
    OpenAIChatMessage.system("You are a story writer."),
    OpenAIChatMessage.user("Write a story about a robot learning to love"),
  ]
);

// print each text fragment as soon as it arrives:
for await (const textFragment of textStream) {
  process.stdout.write(textFragment);
}
```
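The `for await` loop above works because the stream is an async iterable. The sketch below imitates that shape with a local async generator so that it runs without an API key. `fakeTextStream` and `collectText` are made-up stand-ins for illustration, not ModelFusion functions.

```typescript
// Stand-in for a model stream: yields text fragments one at a time.
async function* fakeTextStream(fragments: string[]): AsyncGenerator<string> {
  for (const fragment of fragments) {
    // Yield to the event loop to imitate fragments arriving asynchronously.
    await Promise.resolve();
    yield fragment;
  }
}

// Consumes any async iterable of text fragments and joins them into one string.
async function collectText(stream: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const fragment of stream) {
    text += fragment;
  }
  return text;
}
```

Collecting the whole text like this gives up the benefit of streaming. It is mainly useful in tests or when a caller needs both the live fragments and the final text.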

Prompt Mapping


Prompt mapping lets you use higher-level prompt structures (such as instruction or chat prompts) with different models.

```ts
import {
  generateText,
  InstructionToLlama2PromptMapping,
  LlamaCppTextGenerationModel,
} from "modelfusion";

const text = await generateText(
  new LlamaCppTextGenerationModel({
    contextWindowSize: 4096, // Llama 2 context window size
    nPredict: 1000,
  }).mapPrompt(InstructionToLlama2PromptMapping()),
  {
    system: "You are a story writer.",
    instruction: "Write a short story about a robot learning to love.",
  }
);
```

```ts
import {
  ChatToOpenAIChatPromptMapping,
  OpenAIChatModel,
  streamText,
} from "modelfusion";

const textStream = await streamText(
  new OpenAIChatModel({
    model: "gpt-3.5-turbo",
  }).mapPrompt(ChatToOpenAIChatPromptMapping()),
  [
    { system: "You are a celebrated poet." },
    { user: "Write a short story about a robot learning to love." },
    { ai: "Once upon a time, there was a robot who learned to love." },
    { user: "That's a great start!" },
  ]
);
```
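Conceptually, a prompt mapping is a function from one prompt shape to another. The sketch below converts the chat prompt shape used above into OpenAI-style role/content messages. `mapChatPromptToOpenAIMessages` and both types are illustrative names chosen for this example. It is not the library's implementation.

```typescript
// Chat prompt entries in the shape used in the example above.
type ChatPromptEntry = { system: string } | { user: string } | { ai: string };

// OpenAI chat messages consist of a role and a content string.
type OpenAIStyleMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Illustrative mapping function (hypothetical name, not the ModelFusion API).
function mapChatPromptToOpenAIMessages(
  chatPrompt: ChatPromptEntry[]
): OpenAIStyleMessage[] {
  return chatPrompt.map((entry): OpenAIStyleMessage => {
    if ("system" in entry) return { role: "system", content: entry.system };
    if ("user" in entry) return { role: "user", content: entry.user };
    return { role: "assistant", content: entry.ai };
  });
}

const mappedMessages = mapChatPromptToOpenAIMessages([
  { system: "You are a celebrated poet." },
  { user: "Write a short story about a robot learning to love." },
  { ai: "Once upon a time, there was a robot who learned to love." },
]);
```

Because the mapping is a plain function over data, the same chat prompt can be handed to a different mapping for a model that expects a single text prompt.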

Metadata and original responses


ModelFusion model functions return rich results that include the original response and metadata when you set the fullResponse option to true.

```ts
import { generateText, OpenAITextGenerationModel } from "modelfusion";

// access the full response and the metadata:
// the response type is specific to the model that's being used
const { response, metadata } = await generateText(
  new OpenAITextGenerationModel({
    model: "text-davinci-003",
    maxTokens: 1000,
    n: 2, // generate 2 completions
  }),
  "Write a short story about a robot learning to love:\n\n",
  { fullResponse: true }
);

for (const choice of response.choices) {
  console.log(choice.text);
}

console.log(`Duration: ${metadata.durationInMs}ms`);
```
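One way to type an option like `fullResponse` is with function overloads, so that the return type changes with the flag. The sketch below shows the idea under stated assumptions: `generateTextSketch`, `fakeModelCall`, and the `output` field are invented for this example, and the model call is replaced by a local echo function. How ModelFusion implements this internally is not shown in this README.

```typescript
type FakeRawResponse = { choices: { text: string }[] };
type CallMetadata = { durationInMs: number };
type FullResult = {
  output: string; // assumed field name for the extracted text
  response: FakeRawResponse;
  metadata: CallMetadata;
};

// Stand-in for a model call; returns a fixed raw response.
async function fakeModelCall(prompt: string): Promise<FakeRawResponse> {
  return { choices: [{ text: `Echo: ${prompt}` }] };
}

// Overloads: without options the text is returned,
// with `fullResponse: true` the raw response and metadata are included.
function generateTextSketch(prompt: string): Promise<string>;
function generateTextSketch(
  prompt: string,
  options: { fullResponse: true }
): Promise<FullResult>;
async function generateTextSketch(
  prompt: string,
  options?: { fullResponse: true }
): Promise<string | FullResult> {
  const startTime = Date.now();
  const response = await fakeModelCall(prompt);
  const metadata = { durationInMs: Date.now() - startTime };
  const output = response.choices[0].text;

  return options?.fullResponse ? { output, response, metadata } : output;
}
```

With these overloads, callers that do not ask for the full response get a plain `string` type, and callers that do can destructure `response` and `metadata` without casts.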


Generate a JSON value that matches a schema.

```ts
import {
  generateJson,
  OpenAIChatFunctionPrompt,
  OpenAIChatMessage,
  OpenAIChatModel,
} from "modelfusion";
import { z } from "zod";

const value = await generateJson(
  new OpenAIChatModel({
    model: "gpt-3.5-turbo",
    temperature: 0,
    maxTokens: 50,
  }),
  {
    name: "sentiment" as const,
    description: "Write the sentiment analysis",
    schema: z.object({
      sentiment: z
        .enum(["positive", "neutral", "negative"])
        .describe("Sentiment."),
    }),
  },
  OpenAIChatFunctionPrompt.forSchemaCurried([
    OpenAIChatMessage.system(
      "You are a sentiment evaluator. " +
        "Analyze the sentiment of the following product review:"
    ),
    OpenAIChatMessage.user(
      "After I opened the package, I was met by a very unpleasant smell " +
        "that did not disappear even after washing. Never again!"
    ),
  ])
);
```
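The value of a schema is that a model response is checked before the rest of the program relies on it. The sketch below shows that validation step in isolation. To keep the block dependency-free, a hand-written type guard stands in for the Zod schema, and `isSentimentResult` and `parseSentiment` are hypothetical names.

```typescript
const allowedSentiments = ["positive", "neutral", "negative"] as const;
type Sentiment = (typeof allowedSentiments)[number];
type SentimentResult = { sentiment: Sentiment };

// Hand-written stand-in for schema validation (Zod would do this for you).
function isSentimentResult(value: unknown): value is SentimentResult {
  if (typeof value !== "object" || value === null) return false;
  const candidate = (value as { sentiment?: unknown }).sentiment;
  return (
    typeof candidate === "string" &&
    allowedSentiments.some((allowed) => allowed === candidate)
  );
}

// Parses the raw JSON text from a model and rejects anything off-schema.
function parseSentiment(rawJson: string): SentimentResult {
  const parsed: unknown = JSON.parse(rawJson);
  if (!isSentimentResult(parsed)) {
    throw new Error(`Response does not match the schema: ${rawJson}`);
  }
  return parsed;
}

const validResult = parseSentiment('{"sentiment":"negative"}');
```

After validation, `validResult.sentiment` has the union type `"positive" | "neutral" | "negative"` rather than `string`, which is the kind of type inference the schema provides.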


Generate JSON (or text as a fallback) using a prompt and multiple schemas.
The result either matches one of the schemas or is a text response.

```ts
import {
  generateJsonOrText,
  OpenAIChatFunctionPrompt,
  OpenAIChatMessage,
  OpenAIChatModel,
} from "modelfusion";
import { z } from "zod";

// `query` holds the user's input string and is defined elsewhere.
const { schema, value, text } = await generateJsonOrText(
  new OpenAIChatModel({ model: "gpt-3.5-turbo", maxTokens: 1000 }),
  [
    {
      name: "getCurrentWeather" as const, // mark 'as const' for type inference
      description: "Get the current weather in a given location",
      schema: z.object({
        location: z
          .string()
          .describe("The city and state, e.g. San Francisco, CA"),
        unit: z.enum(["celsius", "fahrenheit"]).optional(),
      }),
    },
    {
      name: "getContactInformation" as const,
      description: "Get the contact information for a given person",
      schema: z.object({
        name: z.string().describe("The name of the person"),
      }),
    },
  ],
  OpenAIChatFunctionPrompt.forSchemasCurried([OpenAIChatMessage.user(query)])
);
```


Tools are functions that can be executed by an AI model. They are useful for building chatbots and agents.

Create Tool


A tool is a function with a name, a description, and a schema for the input parameters.

```ts
import { Tool } from "modelfusion";
import { z } from "zod";

const calculator = new Tool({
  name: "calculator" as const, // mark 'as const' for type inference
  description: "Execute a calculation",

  inputSchema: z.object({
    a: z.number().describe("The first number."),
    b: z.number().describe("The second number."),
    operator: z.enum(["+", "-", "*", "/"]).describe("The operator."),
  }),

  execute: async ({ a, b, operator }) => {
    switch (operator) {
      case "+":
        return a + b;
      case "-":
        return a - b;
      case "*":
        return a * b;
      case "/":
        return a / b;
      default:
        throw new Error(`Unknown operator: ${operator}`);
    }
  },
});
```

useTool


The model determines the parameters for the tool from the prompt and then executes it.

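```ts
import {
  OpenAIChatFunctionPrompt,
  OpenAIChatMessage,
  OpenAIChatModel,
  useTool,
} from "modelfusion";

// `calculator` is the tool defined in the previous example.
const { tool, parameters, result } = await useTool(
  new OpenAIChatModel({ model: "gpt-3.5-turbo" }),
  calculator,
  OpenAIChatFunctionPrompt.forToolCurried([
    OpenAIChatMessage.user("What's fourteen times twelve?"),
  ])
);
```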

useToolOrGenerateText


The model determines which tool to use and its parameters from the prompt and then executes it.
Text is generated as a fallback.

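```ts
import {
  OpenAIChatFunctionPrompt,
  OpenAIChatMessage,
  OpenAIChatModel,
  useToolOrGenerateText,
} from "modelfusion";

// `calculator` is the tool defined in the "Create Tool" example.
const { tool, parameters, result, text } = await useToolOrGenerateText(
  new OpenAIChatModel({ model: "gpt-3.5-turbo" }),
  [calculator /* ... */],
  OpenAIChatFunctionPrompt.forToolsCurried([
    OpenAIChatMessage.user("What's fourteen times twelve?"),
  ])
);
```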
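The "tool or text" flow can be reduced to a small dispatch step: the model either names a tool with parameters or returns plain text. The sketch below simulates the model's decision with a hard-coded value so that it runs offline. `SketchTool`, `ModelDecision`, `calculatorSketch`, and `runDecision` are hypothetical names, and the real `Tool` class additionally validates its input against a schema.

```typescript
// What a model might decide: call a tool, or answer with text.
type ModelDecision =
  | { kind: "tool"; toolName: string; parameters: Record<string, unknown> }
  | { kind: "text"; text: string };

// Simplified tool shape for this sketch (no input schema).
type SketchTool = {
  name: string;
  execute: (parameters: Record<string, unknown>) => Promise<unknown>;
};

const calculatorSketch: SketchTool = {
  name: "calculator",
  execute: async ({ a, b, operator }) => {
    if (typeof a !== "number" || typeof b !== "number") {
      throw new Error("Parameters 'a' and 'b' must be numbers.");
    }
    switch (operator) {
      case "+":
        return a + b;
      case "-":
        return a - b;
      case "*":
        return a * b;
      case "/":
        return a / b;
      default:
        throw new Error(`Unknown operator: ${String(operator)}`);
    }
  },
};

// Executes the chosen tool, or passes the text through as a fallback.
async function runDecision(decision: ModelDecision, tools: SketchTool[]) {
  if (decision.kind === "text") {
    return { tool: null, result: null, text: decision.text };
  }

  const { toolName, parameters } = decision;
  const tool = tools.find((candidate) => candidate.name === toolName);
  if (tool === undefined) {
    throw new Error(`Unknown tool: ${toolName}`);
  }

  const result = await tool.execute(parameters);
  return { tool: tool.name, result, text: null };
}
```

In the real functions, the decision comes from the model's response rather than from a hard-coded object, and the parameters are validated against the tool's input schema before `execute` runs.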


Turn audio (voice) into text.

```ts
import fs from "node:fs";
import { OpenAITranscriptionModel, transcribe } from "modelfusion";

const transcription = await transcribe(
  new OpenAITranscriptionModel({ model: "whisper-1" }),
  {
    type: "mp3",
    data: await fs.promises.readFile("data/test.mp3"),
  }
);
```


Generate a base64-encoded image from a prompt.

```ts
import { generateImage, OpenAIImageGenerationModel } from "modelfusion";

const image = await generateImage(
  new OpenAIImageGenerationModel({ size: "512x512" }),
  "the wicked witch of the west in the style of early 19th century painting"
);
```


Create embeddings for text. Embeddings are vectors that represent the meaning of the text.

```ts
import { embedTexts, OpenAITextEmbeddingModel } from "modelfusion";

const embeddings = await embedTexts(
  new OpenAITextEmbeddingModel({ model: "text-embedding-ada-002" }),
  [
    "At first, Nox didn't know what to do with the pup.",
    "He keenly observed and absorbed everything around him, from the birds in the sky to the trees in the forest.",
  ]
);
```
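Embedding vectors are usually compared with cosine similarity: vectors that point in a similar direction represent texts with similar meaning. The function below is a generic sketch of that measure, written for this README rather than taken from ModelFusion. The name `cosineSimilarity` and the choice to return 0 for zero-length vectors are assumptions of the example.

```typescript
// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions.");
  }

  let dotProduct = 0;
  let squaredNormA = 0;
  let squaredNormB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    squaredNormA += a[i] * a[i];
    squaredNormB += b[i] * b[i];
  }

  // Assumption for this sketch: treat all-zero vectors as dissimilar.
  if (squaredNormA === 0 || squaredNormB === 0) {
    return 0;
  }

  return dotProduct / (Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB));
}
```

The result ranges from -1 (opposite direction) through 0 (orthogonal) to 1 (same direction).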


Split text into tokens and reconstruct the text from tokens.

```ts
import { countTokens, TikTokenTokenizer } from "modelfusion";

const tokenizer = new TikTokenTokenizer({ model: "gpt-4" });

const text = "At first, Nox didn't know what to do with the pup.";

const tokenCount = await countTokens(tokenizer, text);

const tokens = await tokenizer.tokenize(text);
const tokensAndTokenTexts = await tokenizer.tokenizeWithTexts(text);
const reconstructedText = await tokenizer.detokenize(tokens);
```


```ts
import {
  MemoryVectorIndex,
  OpenAITextEmbeddingModel,
  retrieveTextChunks,
  SimilarTextChunksFromVectorIndexRetriever,
  TextChunk,
  upsertTextChunks,
} from "modelfusion";

const texts = [
  "A rainbow is an optical phenomenon that can occur under certain meteorological conditions.",
  "It is caused by refraction, internal reflection and dispersion of light in water droplets resulting in a continuous spectrum of light appearing in the sky.",
  // ...
];

const vectorIndex = new MemoryVectorIndex<TextChunk>();
const embeddingModel = new OpenAITextEmbeddingModel({
  model: "text-embedding-ada-002",
});

// update an index - usually done as part of an ingestion process:
await upsertTextChunks({
  vectorIndex,
  embeddingModel,
  chunks: texts.map((text) => ({ text })),
});

// retrieve text chunks from the vector index - usually done at query time:
const { chunks } = await retrieveTextChunks(
  new SimilarTextChunksFromVectorIndexRetriever({
    vectorIndex,
    embeddingModel,
    maxResults: 3,
    similarityThreshold: 0.8,
  }),
  "rainbow and water droplets"
);
```
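To make the roles of `maxResults` and `similarityThreshold` concrete, here is a toy in-memory index that runs without an embedding model. It is a sketch under stated assumptions, not how `MemoryVectorIndex` is implemented: `TinyVectorIndex` is a hypothetical class, the vectors are hand-written unit vectors (so the dot product equals the cosine similarity), and chunks are keyed by their text.

```typescript
type IndexedChunk = { text: string; vector: number[] };

// For unit-length vectors, the dot product equals the cosine similarity.
function dotProductOf(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

class TinyVectorIndex {
  private readonly entries: IndexedChunk[] = [];

  // Insert new chunks or replace existing ones with the same text.
  upsert(chunks: IndexedChunk[]): void {
    for (const chunk of chunks) {
      const position = this.entries.findIndex(
        (entry) => entry.text === chunk.text
      );
      if (position >= 0) {
        this.entries[position] = chunk;
      } else {
        this.entries.push(chunk);
      }
    }
  }

  // Score every chunk, drop those below the threshold,
  // and return the best matches first, capped at maxResults.
  query(
    queryVector: number[],
    options: { maxResults: number; similarityThreshold: number }
  ): { text: string; similarity: number }[] {
    return this.entries
      .map((entry) => ({
        text: entry.text,
        similarity: dotProductOf(queryVector, entry.vector),
      }))
      .filter((match) => match.similarity >= options.similarityThreshold)
      .sort((left, right) => right.similarity - left.similarity)
      .slice(0, options.maxResults);
  }
}

const tinyIndex = new TinyVectorIndex();
tinyIndex.upsert([
  { text: "rainbow", vector: [1, 0] },
  { text: "water droplets", vector: [0.6, 0.8] },
  { text: "unrelated", vector: [0, 1] },
]);

const rainbowMatches = tinyIndex.query([1, 0], {
  maxResults: 3,
  similarityThreshold: 0.5,
});
```

A linear scan like this is fine for small in-memory collections. Larger collections generally call for a dedicated vector store.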

Features


- Summarize text
  - Call recording
- Utilities
  - Error handling

Integrations


Vector Indices



Observability



Prompt Formats


Use higher-level prompts that are mapped into model-specific prompt formats.

| Prompt  | Instruction | Chat |
| ------- | ----------- | ---- |
| OpenAI  |             |      |
| Llama   |             |      |
| Alpaca  |             |      |
| Vicuna  |             |      |
| Generic |             |      |

Documentation



More Examples



Examples for the individual functions and objects.


_Terminal app_, _chat_, _llama.cpp_



_Next.js app_, _OpenAI GPT-3.5-turbo_, _streaming_, _abort handling_


A web chat with an AI assistant, implemented as a Next.js app.


_terminal app_, _PDF parsing_, _in memory vector indices_, _retrieval augmented generation_, _hypothetical document embedding_


Ask questions about a PDF document and get answers from the document.


_Next.js app_, _Stability AI image generation_


Create a 19th-century painting image for your input.


_Next.js app_, _OpenAI Whisper_


Record audio with push-to-talk and transcribe it using Whisper, implemented as a Next.js app. The app shows a list of the transcriptions.


_terminal app_, _agent_, _BabyAGI_


TypeScript implementation of the BabyAGI classic and BabyBeeAGI.


_terminal app_, _agent_, _tools_, _GPT-4_


Small agent that solves middle school math problems. It uses a calculator tool to solve the problems.


_terminal app_, _PDF parsing_, _recursive information extraction_, _in memory vector index_, _style example retrieval_, _OpenAI GPT-4_, _cost calculation_


Extracts information about a topic from a PDF and writes a tweet in your own style about it.