Latest generation LLM with advanced reasoning, instruction-following, and multilingual support.
Qwen3 32B AWQ is the latest large language model in the Qwen series, with improved reasoning, instruction following, agent capabilities, and multilingual support. It uses AWQ (Activation-aware Weight Quantization) to make inference more efficient while maintaining high output quality.
Stops generation if the given string is encountered.
curl -X POST "https://api.runpod.ai/v2/qwen3-32b-awq/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "Write a Python function that checks if a number is prime:",
      "max_tokens": 512,
      "temperature": 0.7
    }
  }'
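The cURL request above can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names `build_payload` and `run_sync` are hypothetical (they are not part of any Runpod SDK), and the sketch assumes the API key is stored in the `RUNPOD_API_KEY` environment variable, as in the cURL example.

```python
import json
import os
import urllib.request

# Endpoint URL copied from the cURL example.
RUNSYNC_URL = "https://api.runpod.ai/v2/qwen3-32b-awq/runsync"


def build_payload(prompt, max_tokens=512, temperature=0.7):
    """Build a request body with the same shape as the cURL example."""
    return {
        "input": {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
        }
    }


def run_sync(prompt, api_key, timeout=120):
    """POST the payload to /runsync and return the decoded JSON response.

    Hypothetical helper: the timeout value is an arbitrary assumption.
    """
    request = urllib.request.Request(
        RUNSYNC_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (requires network access and a valid key):
# result = run_sync(
#     "Write a Python function that checks if a number is prime:",
#     api_key=os.environ["RUNPOD_API_KEY"],
# )
```

Keeping the payload construction in its own function makes it possible to inspect or log the request body without sending it.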
Qwen3 32B AWQ is fully compatible with the OpenAI API format. You can use the OpenAI Python client to interact with this endpoint.
Python (OpenAI SDK)
import os

from openai import OpenAI

# Read the API key from the environment instead of hard-coding it.
client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url="https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B-AWQ",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful coding assistant.",
        },
        {
            "role": "user",
            "content": "Write a Python function that checks if a number is prime.",
        },
    ],
    max_tokens=525,
)

print(response.choices[0].message.content)
For streaming responses, add stream=True:
Python (Streaming)
# Reuses the client created in the previous example.
response = client.chat.completions.create(
    model="Qwen/Qwen3-32B-AWQ",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=525,
    stream=True,
)

for chunk in response:
    # Skip chunks that carry no choices or no text delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Response
{
  "delayTime": 25,
  "executionTime": 3153,
  "id": "sync-0f3288b5-58e8-46fd-ba73-53945f5e8982-u2",
  "output": [
    {
      "choices": [
        {
          "tokens": [
            "def is_prime(n):\n    if n <= 1:\n        return False\n    for i in range(2, int(n**0.5) + 1):\n        if n % i == 0:\n            return False\n    return True"
          ]
        }
      ],
      "cost": 0.0001,
      "usage": {
        "input": 10,
        "output": 100
      }
    }
  ],
  "status": "COMPLETED",
  "workerId": "pkej0t9bbyjrgy"
}
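The generated text sits fairly deep inside the response shown above, under `output[].choices[].tokens[]`. The following is a minimal sketch of pulling it out together with the token usage. `extract_text` and `total_usage` are hypothetical helpers, the sample is abbreviated from the response above, and treating any status other than `COMPLETED` as an error is an assumption of this sketch, not documented behavior.

```python
def extract_text(response):
    """Join every token string in every choice of every output item."""
    status = response.get("status")
    if status != "COMPLETED":
        # Assumption: anything other than COMPLETED has no usable output.
        raise RuntimeError(f"Job not completed: {status}")
    pieces = []
    for item in response.get("output", []):
        for choice in item.get("choices", []):
            pieces.extend(choice.get("tokens", []))
    return "".join(pieces)


def total_usage(response):
    """Sum the input and output token counts across all output items."""
    totals = {"input": 0, "output": 0}
    for item in response.get("output", []):
        usage = item.get("usage", {})
        totals["input"] += usage.get("input", 0)
        totals["output"] += usage.get("output", 0)
    return totals


# Abbreviated copy of the example response.
sample = {
    "status": "COMPLETED",
    "output": [
        {
            "choices": [{"tokens": ["def is_prime(n):\n    ..."]}],
            "cost": 0.0001,
            "usage": {"input": 10, "output": 100},
        }
    ],
}

print(extract_text(sample))
print(total_usage(sample))  # {'input': 10, 'output': 100}
```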