Overview
AI Inference
AI Inference lets you run trained models to generate predictions and responses, without having to manage inference servers, scaling, or availability yourself.
AI Inference is offered through two main experiences:
Model API
Model Playground
Model API
Overview
Model API provides production-ready inference endpoints for AI models. You can integrate these APIs directly into your applications.
Why use Model API
Building and managing inference infrastructure is complex: you have to handle scaling, recover from failures, and keep performance predictable.
Model API handles these challenges by:
Automatically scaling based on traffic
Providing stable and secure endpoints
Reducing operational overhead
Ensuring consistent performance
What Model API provides
Hosted inference endpoints
Support for popular and validated models
Configuration for performance and scaling
Optional streaming responses
Usage-based billing
You do not need to manage servers or containers.
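Streaming responses arrive as a sequence of small chunks rather than one final payload, so the client assembles the output incrementally. The sketch below shows one way a client might accumulate streamed chunks. The `data:` line framing and the `token` field are assumptions for illustration, not a documented wire format; check the API reference for the actual streaming contract.

```python
import json

def collect_stream(lines):
    """Accumulate streamed chunks into a full response string.

    Assumes a server-sent-events style stream where each chunk is a
    line of the form `data: {"token": "..."}` and the stream ends
    with `data: [DONE]`. This framing is illustrative only.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and keep-alive comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload)["token"])
    return "".join(parts)

# Example stream as it might arrive over HTTP:
stream = [
    'data: {"token": "Hel"}',
    'data: {"token": "lo"}',
    "data: [DONE]",
]
print(collect_stream(stream))  # Hello
```

The same loop works whether the transport is raw HTTP chunked encoding or an SDK iterator; only the line source changes.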
Who should use Model API
Developers building AI-powered applications
Teams adding AI features to existing products
Startups needing fast time to market
Enterprises running production inference workloads
How Model API works
Select a model from the model catalog
Deploy the model as an API endpoint
Configure scaling and compute settings
Send inference requests using HTTP or SDKs
Receive responses in real time
The platform monitors usage and performance automatically.
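The request flow in the steps above can be sketched as plain HTTP. The endpoint URL, the Bearer auth header, and the JSON field names below are placeholders for a typical inference API, not the platform's actual contract; substitute the values from your deployed endpoint.

```python
import json
import urllib.request

def build_inference_request(endpoint, api_key, prompt, **params):
    """Build (but do not send) an HTTP inference request.

    The endpoint URL, auth scheme, and payload field names are
    illustrative placeholders for a typical inference API.
    """
    body = json.dumps({"prompt": prompt, **params}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_inference_request(
    "https://example.com/v1/models/my-model/infer",  # placeholder URL
    api_key="YOUR_API_KEY",
    prompt="Summarize this document.",
    max_tokens=128,
)
# Sending it is one more call: urllib.request.urlopen(req)
print(req.get_method())  # POST
```

An SDK wraps the same request construction behind a method call; the payload shape is what carries over.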
Model Playground
Overview
Model Playground is an interactive interface for testing and experimenting with AI models. It is designed for quick validation before production deployment.
Why use Model Playground
Testing models only through code can slow down experimentation. Model Playground allows faster iteration and easier collaboration.
It helps you:
Validate model behavior
Test prompts and parameters
Compare outputs across models
Reduce trial-and-error during development
What Model Playground provides
Web-based UI for model testing
Prompt input and output visualization
Adjustable inference parameters
Support for text-based inputs
Easy transition to Model API deployment
Who should use Model Playground
ML engineers evaluating models
Developers testing prompts and responses
Product managers validating AI outputs
Teams collaborating on prompt design
How Model Playground works
Select a model in the Playground
Enter a prompt or input text
Adjust parameters such as maximum tokens or temperature
Run inference and view results
Deploy the same model using Model API when ready
No infrastructure setup is required.
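When a prompt tuned in the Playground moves to a Model API call, the same parameters travel with it. The helper below sketches validating Playground-style parameters before they go into a request payload; the parameter names and bounds are common conventions assumed for illustration, not documented platform limits.

```python
def playground_params(temperature=1.0, max_tokens=256):
    """Validate Playground-style parameters for reuse in an API payload.

    The bounds (temperature 0-2, max_tokens >= 1) are common defaults,
    assumed for illustration rather than taken from the platform docs.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return {"temperature": temperature, "max_tokens": max_tokens}

# Carry a prompt tested in the Playground into an API payload:
payload = {
    "prompt": "Draft a product tagline.",
    **playground_params(temperature=0.7),
}
print(payload["temperature"], payload["max_tokens"])  # 0.7 256
```

Keeping the validated parameters in one place makes the Playground-to-API transition a copy of values rather than a rewrite.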