Quick Start

Step 1: Select Model

Start by selecting the model you want to use from the dropdown menu in the Configuration Panel on the left side of the interface.

Step 2: Enter Your Prompt

In the Prompt Editor (center panel), type your question or instruction. For your first test, try one of these examples:

  • Simple question: "What is machine learning?"

  • Technical query: "Explain the difference between supervised and unsupervised learning."

  • Task-based: "Write a Python function that calculates the factorial of a number." (a sample response follows this list)

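A typical model response to the task-based example is a short function like the sketch below. This is illustrative output only; the exact code varies by model.

    # One plausible answer to the factorial prompt: an iterative
    # implementation that returns 1 for n = 0.
    def factorial(n: int) -> int:
        """Return n! for a non-negative integer n."""
        if n < 0:
            raise ValueError("factorial is not defined for negative numbers")
        result = 1
        for i in range(2, n + 1):
            result *= i
        return result

    print(factorial(5))  # 120
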
Step 3: Run and Review

Click the Run Inference button in the top-right corner of the Prompt Editor.

What to Expect

  • Initial delay: You will see a processing indicator while the model processes your prompt. This delay is the Time to First Token (TTFT), typically 1-10 seconds depending on model size and prompt length.

  • Streaming response: The model's answer will appear progressively in the Output panel below the Prompt Editor.

  • Performance metrics: Once the response is complete, you will see two key metrics at the bottom (a sketch showing how to compute them yourself follows this list):

    • Tokens Generated: The number of tokens in the response. A token is roughly 0.75 words, so a 100-word answer comes to about 130 tokens.

    • TTFT: Time to First Token - how long it took before the model started responding.
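
If you later drive a model programmatically rather than through the interface, the same two metrics are easy to compute. The sketch below is a minimal illustration of their definitions; stream_completion is a hypothetical generator that yields one response token at a time (substitute your backend's streaming call), not part of this tool's API.

    import time

    def run_with_metrics(stream_completion, prompt):
        """Consume a token stream and report the Quick Start metrics.

        stream_completion is a hypothetical placeholder: any generator
        that yields one token string at a time will work here.
        """
        start = time.perf_counter()
        ttft = None   # Time to First Token, in seconds
        tokens = 0    # Tokens Generated
        for token in stream_completion(prompt):
            if ttft is None:
                # Delay between submitting the prompt and the first token.
                ttft = time.perf_counter() - start
            tokens += 1
            print(token, end="", flush=True)
        print(f"\nTokens Generated: {tokens}")
        if ttft is not None:
            print(f"TTFT: {ttft:.2f}s")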
