AI Inference

AI Inference allows you to run trained models to generate predictions and responses, without managing inference servers, scaling, or availability yourself.

AI Inference is offered through two main experiences:

  • Model API

  • Model Playground


Model API

Overview

Model API provides production-ready inference endpoints for AI models. You can integrate these APIs directly into your applications.


Why use Model API

Building and managing inference infrastructure is complex: you have to provision compute, scale with traffic, recover from failures, and keep latency consistent.

Model API handles these challenges by:

  • Automatically scaling based on traffic

  • Providing stable and secure endpoints

  • Reducing operational overhead

  • Ensuring consistent performance


What Model API provides

  • Hosted inference endpoints

  • Support for popular and validated models

  • Configuration for performance and scaling

  • Optional streaming responses

  • Usage-based billing

You do not need to manage servers or containers.
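Streamed responses are commonly delivered as server-sent events, one `data:` line per token chunk. The sketch below shows how a client might consume such a stream; the `text` field and the `[DONE]` sentinel are common conventions, not a documented wire format for this service, so check your endpoint's response schema.

```python
import json

def iter_stream_tokens(lines):
    """Parse an SSE-style stream of 'data: {...}' lines into text chunks.

    Assumes each event carries a JSON object with a 'text' field and that
    the stream ends with a 'data: [DONE]' sentinel -- both are common
    conventions, but the actual format depends on the endpoint.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and keep-alive comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        yield json.loads(payload)["text"]

# Canned stream for illustration; a real client reads from the HTTP response.
stream = [
    'data: {"text": "Hello"}',
    'data: {"text": ", world"}',
    "data: [DONE]",
]
print("".join(iter_stream_tokens(stream)))  # → Hello, world
```

Because the parser is a generator, each chunk can be rendered as soon as it arrives rather than after the full response completes.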


Who should use Model API

  • Developers building AI-powered applications

  • Teams adding AI features to existing products

  • Startups needing fast time to market

  • Enterprises running production inference workloads


How Model API works

  1. Select a model from the model catalog

  2. Deploy the model as an API endpoint

  3. Configure scaling and compute settings

  4. Send inference requests using HTTP or SDKs

  5. Receive responses in real time

The platform monitors usage and performance automatically.
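Step 4 can be sketched with nothing but the Python standard library. The endpoint URL, API key, and request/response field names below are placeholders, not the service's actual schema; substitute the values shown for your deployment.

```python
import json
import urllib.request

# Hypothetical endpoint and credentials -- replace with your own values.
ENDPOINT = "https://api.example.com/v1/models/my-model/infer"
API_KEY = "YOUR_API_KEY"

def build_request(prompt, max_tokens=256, temperature=0.7):
    """Assemble an HTTP inference request (step 4 above).

    Field names ('prompt', 'max_tokens', 'temperature') are illustrative;
    consult your endpoint's request schema for the real ones.
    """
    body = json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Summarize this paragraph: ...")
# Step 5 -- send it and read the real-time response (network call omitted here):
# with urllib.request.urlopen(req) as response:
#     print(json.load(response)["output"])
print(req.get_method(), req.full_url)
```

An SDK wraps this same request/response cycle behind a typed client; the HTTP form is shown because it works from any language.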


Model Playground

Overview

Model Playground is an interactive interface for testing and experimenting with AI models. It is designed for quick validation before production deployment.


Why use Model Playground

Testing models only through code can slow down experimentation. Model Playground allows faster iteration and easier collaboration.

It helps you:

  • Validate model behavior

  • Test prompts and parameters

  • Compare outputs across models

  • Reduce trial-and-error during development


What Model Playground provides

  • Web-based UI for model testing

  • Prompt input and output visualization

  • Adjustable inference parameters

  • Support for text-based inputs

  • Easy transition to Model API deployment


Who should use Model Playground

  • ML engineers evaluating models

  • Developers testing prompts and responses

  • Product managers validating AI outputs

  • Teams collaborating on prompt design


How Model Playground works

  1. Select a model in the Playground

  2. Enter a prompt or input text

  3. Adjust parameters such as maximum tokens or temperature

  4. Run inference and view results

  5. Deploy the same model using Model API when ready

No infrastructure setup is required.
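The hand-off in step 5 works because the parameters you tune in the Playground are the same ones the API accepts. A minimal sketch of carrying tuned settings into a request body, with illustrative field names (check your endpoint's schema for the real ones):

```python
# Hypothetical settings tuned interactively in the Playground, reused
# unchanged when switching to Model API (step 5 above).
playground_settings = {
    "model": "my-model",
    "temperature": 0.2,   # lower = more deterministic output
    "max_tokens": 128,    # cap on response length
    "top_p": 0.9,         # nucleus-sampling threshold
}

def to_api_payload(prompt, settings):
    """Combine a prompt with the tuned parameters into one request body."""
    return {"prompt": prompt, **settings}

payload = to_api_payload("Classify the sentiment of: great service!", playground_settings)
print(payload["model"], payload["temperature"])  # → my-model 0.2
```

Keeping the settings in one dictionary means the behavior validated in the Playground is reproduced exactly in production calls.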
