ChatDLM API Documentation
Parallel Decoding Language Model Based on Mask Diffusion
Faster, more accurate, and more comprehensive responses with high-speed text generation support
1. Authentication
All requests must include an API key in the request header:
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Get API Key
Register an account on the official website, complete identity verification, and then generate an API key in the console.
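The two required headers above can be assembled in Python like this (a minimal sketch; `YOUR_API_KEY` is a placeholder for the key obtained from the console, and `build_headers` is an illustrative helper, not part of any official SDK):

```python
# Sketch: build the authentication headers required by every request.
# YOUR_API_KEY is a placeholder; substitute the key from your console.
def build_headers(api_key: str) -> dict:
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_headers("YOUR_API_KEY")
```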
2. Endpoint Details
2.1 Request Method
POST
2.2 Request Route
https://api.chatdlm.com/v1/chat/completions
2.3 Function Description
Generates model responses from user input; supports streaming output and configurable parameters.
3. Request Parameters
| Parameter | Type | Required | Default | Range/Enum | Description |
|---|---|---|---|---|---|
| model | string | Required | - | ChatDLM, ChatDLM-MoE | Model version to use |
| messages | object[] | Required | - | - | Message array, e.g. [{"role": "user", "content": "user input"}] |
| stream | boolean | Optional | false | - | Enable streaming responses (suited to real-time output of long text) |
| max_tokens | integer | Optional | 512 | 1 ≤ x ≤ 8192 | Maximum number of tokens to generate; controls response length |
| temperature | number | Optional | 0.7 | 0.1 ≤ x ≤ 1 | Controls output randomness; higher values are more random (0.8 or above for creative scenarios, 0.2 or below for precise tasks) |
| top_p | number | Optional | 0.8 | 0.1 ≤ x ≤ 1 | Nucleus sampling parameter; limits the probability mass sampled from. Mutually exclusive with temperature |
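The ranges in the table can be checked client-side before sending a request. The sketch below is illustrative (the `validate_params` helper is hypothetical, not part of any official SDK); field names mirror the API parameters above:

```python
# Sketch: validate a request payload against the documented ranges
# before sending. This helper is illustrative only.
def validate_params(payload: dict) -> list[str]:
    errors = []
    if payload.get("model") not in ("ChatDLM", "ChatDLM-MoE"):
        errors.append("model must be 'ChatDLM' or 'ChatDLM-MoE'")
    if not payload.get("messages"):
        errors.append("messages is required and must be non-empty")
    max_tokens = payload.get("max_tokens", 512)
    if not 1 <= max_tokens <= 8192:
        errors.append("max_tokens must be in [1, 8192]")
    temperature = payload.get("temperature", 0.7)
    if not 0.1 <= temperature <= 1:
        errors.append("temperature must be in [0.1, 1]")
    top_p = payload.get("top_p", 0.8)
    if not 0.1 <= top_p <= 1:
        errors.append("top_p must be in [0.1, 1]")
    return errors
```

An empty list means the payload passes all documented range checks.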
4. Request Examples
4.1 Example One
# Non-streaming request example
curl -X POST "https://api.chatdlm.com/v1/chat/completions" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "ChatDLM",
"messages": [{"role": "user", "content": "What is 1+1?"}],
"stream": false,
"max_tokens": 512,
"temperature": 0.7,
"top_p": 0.8
}'
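The same non-streaming request can be built with the Python standard library. This is a sketch: the network call is left commented out so the snippet runs offline, and `YOUR_API_KEY` is a placeholder:

```python
import json
import urllib.request

# Sketch: the non-streaming request from Example One, built with the
# standard library. The actual call is commented out below.
payload = {
    "model": "ChatDLM",
    "messages": [{"role": "user", "content": "What is 1+1?"}],
    "stream": False,
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.8,
}
req = urllib.request.Request(
    "https://api.chatdlm.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```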
4.2 Example Two
# Streaming request example (returns SSE format)
curl -X POST "https://api.chatdlm.com/v1/chat/completions" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "ChatDLM",
"messages": [{"role": "user", "content": "Please explain quantum computing principles"}],
"stream": true
}'
5. Response Examples
5.1 Non-streaming Response
{
"id": "chatcmpl-0196a990-6b5f-73ad-b5d6-b2118e1968fe",
"object": "chat.completion",
"created": 1746601536,
"model": "ChatDLM",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "1+1 equals 2. This is a basic arithmetic operation representing the result of adding two unit quantities."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 50,
"completion_tokens": 32,
"total_tokens": 82
}
}
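The fields a client typically needs are the assistant's reply and the token usage. A minimal parsing sketch, using an abbreviated copy of the response body above:

```python
import json

# Sketch: extract the assistant's reply and token usage from a
# non-streaming response body like the one shown above (abbreviated).
body = json.loads("""{
  "id": "chatcmpl-0196a990-6b5f-73ad-b5d6-b2118e1968fe",
  "object": "chat.completion",
  "model": "ChatDLM",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "1+1 equals 2."},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 50, "completion_tokens": 32, "total_tokens": 82}
}""")
answer = body["choices"][0]["message"]["content"]
total_tokens = body["usage"]["total_tokens"]
```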
5.2 Streaming Response (SSE format fragments)
data: {"choices":[{"delta":{"content":"Quantum"},"index":0,"finish_reason":null}],"created":1746601536,"id":"chatcmpl-123","model":"ChatDLM","object":"chat.completion.chunk"}
data: {"choices":[{"delta":{"content":" computing"},"index":0,"finish_reason":null}],"created":1746601536,"id":"chatcmpl-123","model":"ChatDLM","object":"chat.completion.chunk"}
data: {"choices":[{"delta":{"content":" utilizes"},"index":0,"finish_reason":null}],"created":1746601536,"id":"chatcmpl-123","model":"ChatDLM","object":"chat.completion.chunk"}
data: [DONE]
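A streaming client reads the SSE fragments line by line, concatenates each chunk's `delta.content`, and stops at `[DONE]`. A parsing sketch (the `collect_stream` helper is illustrative; the sample list mimics the fragments above with unrelated fields omitted):

```python
import json

# Sketch: reassemble the full reply from SSE lines of the form shown
# above. Each payload line starts with "data: "; the stream ends
# with the sentinel "[DONE]".
def collect_stream(lines):
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

sample = [
    'data: {"choices":[{"delta":{"content":"Quantum"},"index":0,"finish_reason":null}]}',
    'data: {"choices":[{"delta":{"content":" computing"},"index":0,"finish_reason":null}]}',
    'data: [DONE]',
]
```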
6. Error Responses
When a request fails, the API returns a JSON object in the following format:
{
"error": {
"code": 400,
"message": "Invalid request parameters",
"type": "invalid_request_error"
}
}
Common error codes:
- 401 Unauthorized: Invalid or missing API key
- 400 Bad Request: Malformed request parameters
- 429 Too Many Requests: Request rate exceeds the limit
- 500 Internal Server Error: Server-side error
- 504 Gateway Time-out: The request timed out on the server
- Request failed.: Check whether the input parameters are too long or contain invalid characters
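Of the codes above, rate limits and server-side failures (429, 500, 504) are usually worth retrying, while 400 and 401 indicate a problem with the request itself. A classification sketch using the error format shown earlier (the retry policy and `classify_error` helper are illustrative, not prescribed by the API):

```python
import json

# Sketch: separate retryable errors (rate limit, server failure) from
# fatal ones, based on the error body format documented above.
# The retry policy here is illustrative only.
RETRYABLE = {429, 500, 504}

def classify_error(body: str) -> str:
    err = json.loads(body)["error"]
    return "retry" if err["code"] in RETRYABLE else "fail"
```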
7. Important Notes
- All requests must use HTTPS
- Keep API keys strictly confidential and avoid leaking them
- The free tier limits request rate and daily call volume; upgrading your plan raises these quotas
- Model responses are limited by training data and are not guaranteed to be 100% accurate