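The pipeline API

Just like the transformers Python library, Transformers.js provides users with a simple way to leverage the power of transformers. The pipeline() function is the easiest and fastest way to use a pretrained model for inference.

Basics

Start by creating an instance of pipeline() and specifying the task you want to use it for. For example, to create a sentiment analysis pipeline, you can do: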

javascript

import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline("sentiment-analysis");

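When running for the first time, the pipeline will download and cache the default pretrained model associated with the task. This can take a while, but subsequent calls will be much faster.

By default, models are downloaded from the Hugging Face Hub and stored in the browser cache, but you can also specify custom models and cache locations. For more information, see the documentation on custom models and cache locations.

You can now apply the classifier to your text by calling it as a function: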

javascript

const result = await classifier("I love transformers!");
// [{'label': 'POSITIVE', 'score': 0.9998}]

If you have multiple inputs, you can pass them as an array:

javascript

const result = await classifier([
  "I love transformers!",
  "I hate transformers!",
]);
// [{'label': 'POSITIVE', 'score': 0.9998}, {'label': 'NEGATIVE', 'score': 0.9982}]
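
When classifying several texts at once, it is often convenient to keep each result next to the text it belongs to. The sketch below assumes the results come back in the same order as the inputs. The pairWithInputs helper is hypothetical and not part of Transformers.js; the sample data is copied from the output above so the snippet runs without downloading a model.

```javascript
// Hypothetical helper (not part of Transformers.js): pairs each input text
// with the classification result at the same index.
// Assumption: results are returned in the same order as the inputs.
function pairWithInputs(texts, results) {
  if (texts.length !== results.length) {
    throw new Error("Expected one result per input text");
  }
  return texts.map((text, i) => ({ text, ...results[i] }));
}

// Sample data copied from the example output above.
const sampleTexts = ["I love transformers!", "I hate transformers!"];
const sampleResults = [
  { label: "POSITIVE", score: 0.9998 },
  { label: "NEGATIVE", score: 0.9982 },
];

const paired = pairWithInputs(sampleTexts, sampleResults);
console.log(paired);
```

In real use, sampleResults would be replaced by the value returned from `await classifier(sampleTexts)`.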

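You can also specify a different model for the pipeline by passing it as the second argument to the pipeline() function. For example, to use a different model for sentiment analysis (such as one trained to predict the sentiment of a review as a rating between 1 and 5 stars), you can do: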

javascript

const reviewer = await pipeline(
  "sentiment-analysis",
  "Xenova/bert-base-multilingual-uncased-sentiment",
);

const result = await reviewer(
  "The Shawshank Redemption is a true masterpiece of cinema.",
);
// [{label: '5 stars', score: 0.8167929649353027}]
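
Since this model reports its prediction as a text label, downstream code may want the rating as a number. The following is a minimal sketch that assumes labels always follow the "N stars" format seen in the output above. The parseStars helper is hypothetical and not part of Transformers.js.

```javascript
// Hypothetical helper: extracts the numeric rating from a label such as "5 stars".
// Assumption: labels follow the format shown in the example output above.
function parseStars(label) {
  const match = /^(\d+)\s+stars?$/.exec(label);
  if (!match) {
    throw new Error(`Unexpected label format: ${label}`);
  }
  return Number(match[1]);
}

// Sample result copied from the example output above.
const sampleReview = [{ label: "5 stars", score: 0.8167929649353027 }];
const stars = parseStars(sampleReview[0].label);
console.log(stars);
```

Throwing on an unexpected label is a deliberate choice here: silently returning NaN would hide a change of model or label scheme.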

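Transformers.js supports loading any model hosted on the Hugging Face Hub, provided it has ONNX weights (located in a subfolder called onnx). For more information on how to convert a PyTorch, TensorFlow, or JAX model to ONNX, see the conversion section.

The pipeline() function is a great way to quickly use a pretrained model for inference, because it takes care of all the preprocessing and postprocessing for you. For example, to perform automatic speech recognition (ASR) using OpenAI's Whisper model, you can do: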

javascript

// Create a pipeline for Automatic Speech Recognition
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-small.en",
);

// Transcribe an audio file, loaded from a URL.
const result = await transcriber(
  "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac",
);
// {text: ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
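
The transcription above begins with a leading space. If that matters for display, a small post-processing step can remove it. This sketch assumes only the `{ text }` result shape shown above; cleanTranscript is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: returns the transcript with surrounding whitespace removed.
// Assumption: the result has the { text: string } shape shown above.
function cleanTranscript(result) {
  return result.text.trim();
}

// Sample result copied from the example output above.
const sampleTranscription = {
  text: " I have a dream that one day this nation will rise up and live out the true meaning of its creed.",
};

const transcript = cleanTranscript(sampleTranscription);
console.log(transcript);
```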

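Pipeline options

Loading

We offer a variety of options to control how models are loaded from the Hugging Face Hub (or locally). By default, when running in the browser, a quantized version of the model is used, which is smaller and faster but usually less accurate. To override this behaviour (i.e., use the unquantized model), you can pass a custom PretrainedOptions object as the third parameter to the pipeline function: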

javascript

// Create a pipeline for feature extraction, using the full-precision model (fp32)
const pipe = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2", {
  dtype: "fp32",
});

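Check out the section on quantization to learn more.

You can also specify which revision of the model to use by passing a revision parameter. Since the Hugging Face Hub uses a git-based versioning system, you can use any valid git revision specifier (e.g., a branch name or commit hash).

javascript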

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-tiny.en",
  {
    revision: "output_attentions",
  },
);
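
The two loading options described above can be combined in a single options object. The sketch below builds such an object from plain flags. buildLoadOptions is a hypothetical helper, and it only uses the dtype and revision keys shown in this section; the "main" branch name is an assumption for illustration.

```javascript
// Hypothetical helper (not part of Transformers.js): builds the options object
// passed as the third argument to pipeline(). Only the `dtype` and `revision`
// keys shown in this section are used.
function buildLoadOptions({ fullPrecision = false, revision } = {}) {
  const options = {};
  if (fullPrecision) {
    options.dtype = "fp32"; // request the unquantized model
  }
  if (revision !== undefined) {
    options.revision = revision; // branch name or commit hash
  }
  return options;
}

// "main" is an assumed branch name used for illustration only.
const loadOptions = buildLoadOptions({ fullPrecision: true, revision: "main" });
console.log(loadOptions);
```

The resulting object could then be passed as `await pipeline(task, model, loadOptions)`. Leaving a key out, rather than setting it to undefined, keeps the library defaults in effect.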

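Running

Many pipelines have additional options that you can specify. For example, when using a model that performs multilingual translation, you can specify the source and target languages like this: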

javascript

// Create a pipeline for translation
const translator = await pipeline(
  "translation",
  "Xenova/nllb-200-distilled-600M",
);

// Translate from English to Greek
const result = await translator("I like to walk my dog.", {
  src_lang: "eng_Latn",
  tgt_lang: "ell_Grek",
});
// [ { translation_text: 'Μου αρέσει να περπατάω το σκυλί μου.' } ]

// Translate back to English
const result2 = await translator(result[0].translation_text, {
  src_lang: "ell_Grek",
  tgt_lang: "eng_Latn",
});
// [ { translation_text: 'I like to walk my dog.' } ]
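
The round trip above repeats the same two language codes in swapped positions. One way to avoid mixing them up is to derive both option objects from a single pair. This is only a sketch: translationOptions is a hypothetical helper, and it relies solely on the src_lang and tgt_lang keys used in the example above.

```javascript
// Hypothetical helper: builds the options for translating in both directions
// between two languages, using the src_lang / tgt_lang keys shown above.
function translationOptions(srcLang, tgtLang) {
  return {
    forward: { src_lang: srcLang, tgt_lang: tgtLang },
    backward: { src_lang: tgtLang, tgt_lang: srcLang },
  };
}

// Language codes copied from the English/Greek example above.
const englishGreek = translationOptions("eng_Latn", "ell_Grek");
console.log(englishGreek);
```

With this in place, the two calls above would become `translator(text, englishGreek.forward)` and `translator(translated, englishGreek.backward)`.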

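When using models that support auto-regressive generation, you can specify generation parameters such as the number of new tokens, the sampling method, temperature, repetition penalty, and more. For a full list of available parameters, see the GenerationConfig class.

For example, to generate a poem using LaMini-Flan-T5-783M, you can do: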

javascript

// Create a pipeline for text2text-generation
const poet = await pipeline(
  "text2text-generation",
  "Xenova/LaMini-Flan-T5-783M",
);
const result = await poet("Write me a love poem about cheese.", {
  max_new_tokens: 200,
  temperature: 0.9,
  repetition_penalty: 2.0,
  no_repeat_ngram_size: 3,
});

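Logging result[0].generated_text to the console gives:

txt

Cheese, oh cheese! You're the perfect comfort food.
Your texture so smooth and creamy you can never get old.
With every bite it melts in your mouth like buttery delights
that make me feel right at home with this sweet treat of mine.

From classic to bold flavor combinations,
I love how versatile you are as an ingredient too!
Cheddar is my go-to for any occasion or mood;
It adds depth and richness without being overpowering its taste buds alone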
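
If several calls share most of their generation parameters, the common values can be kept in one place and overridden per call. The sketch below assumes nothing beyond the four parameter names used in the example above. generationOptions and DEFAULT_GENERATION_OPTIONS are hypothetical names, not part of the library.

```javascript
// Shared generation settings, copied from the poem example above.
// Frozen so that individual calls cannot modify the shared defaults.
const DEFAULT_GENERATION_OPTIONS = Object.freeze({
  max_new_tokens: 200,
  temperature: 0.9,
  repetition_penalty: 2.0,
  no_repeat_ngram_size: 3,
});

// Hypothetical helper: returns a new options object in which any key
// supplied in `overrides` replaces the shared default.
function generationOptions(overrides = {}) {
  return { ...DEFAULT_GENERATION_OPTIONS, ...overrides };
}

const shortPoemOptions = generationOptions({ max_new_tokens: 50 });
console.log(shortPoemOptions);
```

The returned object could be passed as the second argument of the pipeline call, for example `poet(prompt, shortPoemOptions)`.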

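Streaming

Some pipelines, such as text-generation or automatic-speech-recognition, support streaming output. This is achieved using the TextStreamer class. For example, when using a chat model like Qwen2.5-Coder-0.5B-Instruct, you can specify a callback function that will be called with the text of each generated token (if unset, new tokens are printed to the console).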

javascript

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-Coder-0.5B-Instruct",
  { dtype: "q4" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Write a quick sort algorithm." },
];

// Create text streamer
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  // Optionally, do something with the text (e.g., write to a textbox)
  // callback_function: (text) => { /* Do something with text */ },
});

// Generate a response
const result = await generator(messages, {
  max_new_tokens: 512,
  do_sample: false,
  streamer,
});

Logging result[0].generated_text to the console gives:

Here's a simple implementation of the quick sort algorithm in Python:

python

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
# Example usage:
arr = [3, 6, 8, 10, 1, 2]
sorted_arr = quick_sort(arr)
print(sorted_arr)

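Explanation:
Base case: If the array has one element or fewer (i.e., len(arr) is less than or equal to 1), it is already sorted and can be returned as is.
Pivot selection: The pivot is chosen as the middle element of the array.
Partitioning: The array is partitioned into three parts: elements less than the pivot (left), elements equal to the pivot (middle), and elements greater than the pivot (right). These partitions are then recursively sorted.
Recursive sorting: The subarrays are sorted recursively using quick_sort. This approach ensures that each recursive call reduces the problem size by half until it reaches a base case.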

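This streaming feature allows you to process the output as it is generated, rather than waiting for the entire output to be generated before processing it.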
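
As an illustration of processing output while it is generated, the sketch below shows a callback that accumulates streamed text chunks. createChunkCollector is a hypothetical helper, not part of Transformers.js, and the chunks fed to it here are made-up sample strings that stand in for what a streamer would deliver.

```javascript
// Hypothetical helper: creates a callback that stores each streamed text chunk,
// plus a function that returns everything received so far.
function createChunkCollector() {
  const chunks = [];
  return {
    callback: (text) => {
      chunks.push(text);
    },
    text: () => chunks.join(""),
  };
}

const collector = createChunkCollector();

// Simulate a streamer delivering three chunks (made-up sample data).
for (const chunk of ["def ", "quick_sort", "(arr):"]) {
  collector.callback(chunk);
}

console.log(collector.text());
```

To use it with the example above, collector.callback would be supplied as the callback_function option of the TextStreamer, the option that appears commented out in that example.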

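For more information on the options available for each pipeline, refer to the API reference. If you would like more control over the inference process, you can use the AutoModel, AutoTokenizer, or AutoProcessor classes instead.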
