Quick Start
Step 1: Select Model
Start by selecting the model you want to use from the dropdown menu in the Configuration Panel on the left side of the interface.
Step 2: Enter Your Prompt
In the Prompt Editor (center panel), type your question or instruction. For your first test, try one of these examples:
Simple question: "What is machine learning?"
Technical query: "Explain the difference between supervised and unsupervised learning."
Task-based: "Write a Python function that calculates the factorial of a number."
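For the task-based example, the model's answer will typically resemble the sketch below. The exact code the model produces will vary from run to run; this is just one correct form of the requested function:

```python
def factorial(n: int) -> int:
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial(5))  # 120
```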
Step 3: Run and Review
Click the Run Inference button in the top-right corner of the Prompt Editor.
What to Expect
Initial delay: You will see a processing indicator while the model processes your prompt. This delay is the Time to First Token (TTFT) - typically 1-10 seconds depending on model size and prompt length.
Streaming response: The model's answer will appear progressively in the Output panel below the prompt editor.
Performance metrics: Once complete, you will see two key metrics at the bottom:
Tokens Generated: The number of tokens in the response. A token is roughly 0.75 words, so a 100-token response works out to about 75 words.
TTFT: Time to First Token - how long it took before the model started responding.
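If you later move from the interface to scripted calls, you can reproduce both metrics yourself: record the gap before the first streamed chunk arrives (that is the TTFT) and count the chunks that follow. The sketch below shows the general pattern; the stream_tokens generator is a hypothetical stand-in for whatever streaming client your backend provides, not part of this tool's API:

```python
import time
from typing import Iterator

def stream_tokens(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming inference client.

    Replace with your real client; here we simply simulate a model
    emitting tokens after a startup delay.
    """
    time.sleep(1.0)  # simulated prompt processing before the first token
    for token in ["Machine", " learning", " is", " ..."]:
        time.sleep(0.05)  # simulated per-token latency
        yield token

start = time.perf_counter()
ttft = None
tokens_generated = 0

for token in stream_tokens("What is machine learning?"):
    if ttft is None:
        # Time to First Token: elapsed time until the first chunk arrives.
        ttft = time.perf_counter() - start
    tokens_generated += 1

print(f"TTFT: {ttft:.2f}s, tokens generated: {tokens_generated}")
```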