Prompt Editor
The Prompt Editor is your primary interface for interacting with the model. This section explains how to use it effectively and what to expect during execution.
Entering Your Query
The prompt editor accepts your input query or instruction. This is what the model will respond to, operating within the constraints defined by your system prompt and configuration parameters.
Effective Prompting Techniques
Be clear and specific: Instead of 'Tell me about machine learning,' try 'Explain the difference between supervised and unsupervised machine learning, with examples of each.'
Provide context when needed: If your question builds on domain-specific knowledge, include that context. For example, 'Given a neural network with 3 hidden layers...' establishes the scenario.
Break down complex requests: If you need multi-step reasoning, explicitly outline the steps: 'First, explain X. Then, describe how Y relates to X. Finally, compare both to Z.'
Specify output constraints: If you need a particular format, state it explicitly: 'Provide your answer in a numbered list' or 'Respond in valid JSON format.'
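The techniques above can be combined in a single prompt. The sketch below is a hypothetical example (the wording and topic are illustrative, not prescribed by the tool): it sets context, outlines steps explicitly, and ends with a format constraint.

```python
# Hypothetical prompt combining the techniques above: context first,
# explicit steps, and an output-format constraint at the end.
prompt = (
    "Given a neural network with 3 hidden layers trained on image data:\n"
    "First, explain what overfitting is. "
    "Then, describe how dropout mitigates it. "
    "Finally, compare dropout to L2 regularization.\n"
    "Provide your answer in a numbered list."
)
print(prompt)
```

A prompt structured this way gives the model both the scenario and the shape of the expected answer, which tends to produce more predictable output than a single open-ended question.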
Running Inference
When you click the Run Inference button, the system initiates the model execution pipeline. Understanding what happens during this process helps you interpret results and troubleshoot issues.
Execution Pipeline
Tokenization: Your input text (system prompt plus user query) is converted into tokens using the model's tokenizer. This is why character count does not directly map to token count.
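To see why character count and token count diverge, consider a toy greedy subword tokenizer (a deliberately tiny sketch, not the model's actual tokenizer): a 16-character string can collapse to just a few tokens when common substrings are single vocabulary entries.

```python
# Toy subword tokenizer sketch (NOT the model's real tokenizer), showing
# why character count does not map directly to token count.
VOCAB = {"machine": 0, "learn": 1, "ing": 2, " ": 3}

def toy_tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenization over a tiny vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for piece in sorted(VOCAB, key=len, reverse=True):
            if text.startswith(piece, i):
                tokens.append(VOCAB[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return tokens

ids = toy_tokenize("machine learning")
print(len("machine learning"), len(ids))  # 16 characters, 4 tokens
```

Real tokenizers (BPE and variants) work on the same principle at a much larger vocabulary scale, which is why token counts are best measured with the model's own tokenizer rather than estimated from characters.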
Context preparation: The tokenized input is formatted according to the model's expected input structure, which may include special tokens or formatting markers.
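A minimal sketch of this formatting step is shown below. The `<|system|>`, `<|user|>`, and `<|assistant|>` markers are hypothetical placeholders; each model family defines its own special tokens and template.

```python
# Sketch of context preparation with hypothetical special-token markers.
# Real models define their own chat templates and marker tokens.
def build_context(system_prompt: str, user_query: str) -> str:
    """Interleave the system prompt and user query with role markers,
    ending at the point where the model should begin its reply."""
    return (
        f"<|system|>\n{system_prompt}\n"
        f"<|user|>\n{user_query}\n"
        f"<|assistant|>\n"
    )

print(build_context("You are a helpful assistant.", "Explain dropout."))
```

Note that the template ends with the assistant marker: the model's generated tokens continue the sequence from exactly that point.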
Model loading (if needed): If the selected model is not already in memory, the system loads the model weights onto the GPU. This step introduces latency on first execution.
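The "load once, reuse thereafter" behavior can be sketched as a simple in-memory cache (the loader below is a stand-in, not the system's actual implementation): the expensive load runs only the first time a given model name is requested.

```python
import functools

# Sketch of lazy model loading with an in-memory cache: the load cost is
# paid only on first use. The returned dict is a stand-in for real weights.
@functools.lru_cache(maxsize=2)  # keep at most two models resident
def get_model(name: str) -> dict:
    print(f"loading {name}...")  # printed once per model name
    return {"name": name, "weights": "..."}

get_model("model-a")  # first call: prints "loading model-a..."
get_model("model-a")  # cache hit: no load message, same object returned
```

This is why the first run after selecting a new model is noticeably slower than subsequent runs.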
Inference execution: The model processes your input and generates tokens one at a time (autoregressive generation) until reaching a stop condition.
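The autoregressive loop can be sketched as follows. The `next_token` function here is a random stand-in for a real forward pass; the loop structure (append each token to the context, stop at an end-of-sequence token or a length limit) is the part being illustrated.

```python
import random

# Minimal sketch of autoregressive generation: produce one token at a
# time, feed it back into the context, and stop at an end-of-sequence
# token or a length limit. next_token stands in for a model forward pass.
EOS = 0

def next_token(context: list[int]) -> int:
    return random.randint(0, 5)  # a real model would return sampled logits

def generate(prompt_ids: list[int], max_new_tokens: int = 20) -> list[int]:
    out = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok = next_token(out)
        if tok == EOS:       # stop condition: end-of-sequence token
            break
        out.append(tok)      # the new token becomes part of the context
    return out

print(generate([4, 2, 3]))
```

Because each step conditions on everything generated so far, output length directly affects latency: every additional token requires another pass through the model.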
Response delivery: Generated tokens are decoded back into text and streamed to the output panel as they are produced.
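Streaming delivery can be sketched with a generator: each token is decoded to text and emitted as soon as it is available, rather than buffering the full response. The toy id-to-text table below is illustrative only.

```python
# Sketch of streaming delivery: decode tokens to text as they arrive and
# yield each piece immediately. The vocabulary here is a toy example.
ID_TO_TEXT = {1: "Hello", 2: ",", 3: " world", 4: "!"}

def stream_decode(token_ids):
    for tok in token_ids:
        yield ID_TO_TEXT[tok]  # emit each decoded chunk as it is produced

for chunk in stream_decode([1, 2, 3, 4]):
    print(chunk, end="")       # the output panel appends chunks in order
print()
```

This is why text appears in the output panel incrementally: the interface renders each chunk as it arrives instead of waiting for generation to finish.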