Basic Promptfoo Hands-On Exercise

This is a basic hands-on exercise to help you start evaluating and comparing prompts for large language models (LLMs) with Promptfoo. Promptfoo is useful for optimizing prompts, benchmarking several models against each other, and testing changes in applications that use LLMs.

OBJECTIVES

Use Promptfoo to benchmark prompts.

Compare output across multiple models, such as gpt-3.5-turbo (OpenAI) and mistral-7b-instruct (via OpenRouter).

Customize the criteria used to evaluate output.

Run tests from the CLI and view the results.

STEP 1: INSTALL PROMPTFOO

npm install -g promptfoo
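
After installing, it can help to confirm that the tools are actually on your PATH before continuing. The following is a minimal sketch, assuming a POSIX shell; check_cli is a hypothetical helper name introduced here for illustration, not part of Promptfoo.

```shell
# check_cli NAME: report whether a command called NAME is on PATH.
# (check_cli is a hypothetical helper, not a promptfoo command.)
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# npm ships with Node.js, so check for node as well as promptfoo.
check_cli node
check_cli promptfoo
```

If promptfoo is reported as not found right after a global install, the npm global bin directory is probably missing from PATH.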

STEP 2: CREATE THE PROJECT DIRECTORY STRUCTURE

mkdir promptfoo-demo && cd promptfoo-demo
touch config.yaml
mkdir evals

STEP 3: CREATE THE config.yaml CONFIGURATION FILE

prompts:
  - "Translate to French: {{input}}"
  - "Please provide the French translation for: {{input}}"

providers:
  # API keys are read from the OPENAI_API_KEY and OPENROUTER_API_KEY
  # environment variables; a literal "$OPENAI_API_KEY" is not expanded in YAML.
  - id: openai:gpt-3.5-turbo
  - id: openrouter:mistralai/mistral-7b-instruct

tests:
  - vars:
      input: "Hello, how are you?"
    assert:
      # Pass when the output contains the expected substring.
      - type: contains
        value: "Bonjour"

  - vars:
      input: "I am going to the market."
    assert:
      - type: contains
        value: "marché"
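
Each test case above checks that the model output includes an expected substring, and Promptfoo runs every prompt against every provider for every test case. The sketch below illustrates both ideas in plain shell. It is not Promptfoo's implementation: contains is a hypothetical helper, and sample_output is a hand-written string standing in for a model response.

```shell
# contains HAYSTACK NEEDLE: succeed when NEEDLE is a substring of HAYSTACK.
# This mirrors the idea of a substring check; it is not promptfoo's code.
contains() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hand-written sample output (an assumption, not a real model response).
sample_output="Bonjour, comment allez-vous ?"

if contains "$sample_output" "Bonjour"; then
  echo "PASS"
else
  echo "FAIL"
fi

# Every prompt runs against every provider for every test case.
prompts=2
providers=2
tests=2
total=$((prompts * providers * tests))
echo "evaluations: $total"
```

With two prompts, two providers, and two test cases, the configuration in this step produces eight evaluations. Note that a plain substring check is case-sensitive, so "bonjour" would not match "Bonjour".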

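You need API keys from OpenAI and OpenRouter. Export them as the OPENAI_API_KEY and OPENROUTER_API_KEY environment variables before running the evaluation.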
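
A quick pre-flight check can catch a missing key before any request is sent. This is a minimal sketch, assuming a POSIX shell with printenv available; missing_keys is a hypothetical helper introduced here, and the variable names are the ones referenced in config.yaml.

```shell
# missing_keys: print the name of each required variable that is
# unset or empty in the environment. (Hypothetical helper.)
missing_keys() {
  for name in OPENAI_API_KEY OPENROUTER_API_KEY; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
    fi
  done
}

missing="$(missing_keys)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on a single line.
  echo "Missing API keys:" $missing
else
  echo "All API keys are set."
fi
```

Because printenv only sees exported variables, a key assigned without export is reported as missing, which matches what a child process such as the promptfoo CLI would see.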

STEP 4: RUN THE EVALUATION

Run the benchmark from the CLI, passing the configuration file with the -c option:

promptfoo eval -c config.yaml

Then open the results dashboard:

promptfoo view

By default the viewer is served at http://localhost:15500. Open that address in a browser to explore the results visually.