diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 7dd9570c74..7bf0c9d462 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -75,6 +75,14 @@ To track your token consumption:
 1. Navigate to [**Billing and subscriptions > Usage**](https://cloud.elastic.co/billing/usage) in the {{ecloud}} Console
 2. Look for line items where the **Billing dimension** is set to "Inference"
 
+### Fair usage during free trial
+
+Accounts in the free trial period are subject to token limits that are considered "fair usage". Access to some models may be paused temporarily if this limit is exceeded.
+
+Fair usage limits while account is in free trial:
+- **Elastic Managed LLM:** 100 million input tokens in 24h or 5 million output tokens in 24h
+- **ELSER**: 1 billion tokens in 24h
+
 ## Rate limits
 
 The service enforces rate limits on an ongoing basis. Exceeding a limit will result in HTTP 429 responses from the server until the sliding window moves on further and parts of the limit resets.
@@ -88,7 +96,7 @@ The service enforces rate limits on an ongoing basis. Exceeding a limit will res
 
 We limit on both requests per minute and tokens per minute (whichever limit is reached first).
 
-#### Ingest 
+#### Ingest
 
 - 6,000 request per minute
 - 6,000,000 tokens per minute
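
Review note: the rate-limit text in this diff describes a sliding window with two limits (requests per minute and tokens per minute, whichever is reached first). A minimal client-side sketch of that idea follows, assuming a caller wants to throttle itself before the server returns HTTP 429. The class and method names are hypothetical, and the diff does not describe the service's own server-side algorithm, so this is only an illustration of the dual-limit check, not the service's implementation.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side tracker for requests and tokens in a sliding window."""

    def __init__(self, max_requests, max_tokens, window_seconds=60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._events = deque()  # one (timestamp, tokens) entry per admitted request
        self._tokens_in_window = 0

    def _evict(self, now):
        # Drop admitted requests that have slid out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def try_acquire(self, tokens, now):
        """Record the request and return True only if both limits allow it."""
        self._evict(now)
        if len(self._events) + 1 > self.max_requests:
            return False  # request-count limit reached first
        if self._tokens_in_window + tokens > self.max_tokens:
            return False  # token limit reached first
        self._events.append((now, tokens))
        self._tokens_in_window += tokens
        return True


# Ingest limits quoted in the diff: 6,000 requests and 6,000,000 tokens per minute.
ingest_limiter = SlidingWindowLimiter(max_requests=6_000, max_tokens=6_000_000)
```

In real use, `now` would typically come from `time.monotonic()`; it is passed explicitly here so the behavior is easy to check. A caller that gets `False` can wait and retry, which mirrors how the window "moves on" and capacity is released.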