Quick Start

Step 1: Select Model

Start by selecting the model you want to use from the dropdown menu in the Configuration Panel on the left side of the interface.

Step 2: Enter Your Prompt

In the Prompt Editor (center panel), type your question or instruction. For your first test, try one of these examples:

  • Simple question: "What is machine learning?"

  • Technical query: "Explain the difference between supervised and unsupervised learning."

  • Task-based: "Write a Python function that calculates the factorial of a number." (a sample response follows this list)

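A typical model response to the task-based example is a short function like the sketch below. This is illustrative output only; the exact code varies by model.

    # One plausible answer to the factorial prompt: an iterative
    # implementation that returns 1 for n = 0.
    def factorial(n: int) -> int:
        """Return n! for a non-negative integer n."""
        if n < 0:
            raise ValueError("factorial is not defined for negative numbers")
        result = 1
        for i in range(2, n + 1):
            result *= i
        return result

    print(factorial(5))  # 120
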
Step 3: Run and Review

Click the Run Inference button in the top-right corner of the Prompt Editor.

What to Expect

  • Initial delay: You will see a processing indicator while the model processes your prompt. This delay is the Time to First Token (TTFT), typically 1-10 seconds depending on model size and prompt length.

  • Streaming response: The model's answer will appear progressively in the Output panel below the Prompt Editor.

  • Performance metrics: Once the response is complete, you will see two key metrics at the bottom (a sketch showing how to compute them yourself follows this list):

    • Tokens Generated: The number of tokens in the response. A token is roughly 0.75 words, so a 100-word answer comes to about 130 tokens.

    • TTFT: Time to First Token - how long it took before the model started responding.
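
If you later drive a model programmatically rather than through the interface, the same two metrics are easy to compute. The sketch below is a minimal illustration of their definitions; stream_completion is a hypothetical generator that yields one response token at a time (substitute your backend's streaming call), not part of this tool's API.

    import time

    def run_with_metrics(stream_completion, prompt):
        """Consume a token stream and report the Quick Start metrics.

        stream_completion is a hypothetical placeholder: any generator
        that yields one token string at a time will work here.
        """
        start = time.perf_counter()
        ttft = None   # Time to First Token, in seconds
        tokens = 0    # Tokens Generated
        for token in stream_completion(prompt):
            if ttft is None:
                # Delay between submitting the prompt and the first token.
                ttft = time.perf_counter() - start
            tokens += 1
            print(token, end="", flush=True)
        print(f"\nTokens Generated: {tokens}")
        if ttft is not None:
            print(f"TTFT: {ttft:.2f}s")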
