Async call to IBM watsonx.ai

Ong Khai Wei
2 min read · May 12, 2024


asyncio is commonly used to run multiple I/O-bound operations concurrently, using the async/await syntax. The Python library provided by IBM does not yet offer an async version of generate, so this article gives a simple example of using the aiohttp library to make an async call to the IBM watsonx REST API (https://cloud.ibm.com/apidocs/watsonx-ai). Let’s look at the code in a bit more detail.
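
To make the idea concrete, here is a minimal sketch that uses only the standard library. The fake_request coroutine is a hypothetical stand-in for a network call (it only sleeps), so nothing here touches watsonx. Three 0.2-second waits started through asyncio.gather should finish in roughly 0.2 seconds overall rather than 0.6, because the waits overlap.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    """Hypothetical stand-in for a network call; sleeping yields control to the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_all() -> list:
    # gather runs the coroutines concurrently and returns results in argument order.
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```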

IBM watsonx REST API Documentation

Endpoint

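There are 2 IBM watsonx endpoints, listed at https://cloud.ibm.com/apidocs/watsonx-ai#endpoint-url. The full example in this article uses the us-south one (https://us-south.ml.cloud.ibm.com).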

Authentication

An access token is needed to make the REST API call. Here we use an IBM Cloud API key to generate the access token with the ibmcloud-iam Python SDK.

from ibmcloud_iam.token import TokenManager

# Replace the placeholder string with your IBM Cloud API key
tm = TokenManager(api_key="IBM_CLOUD_API_KEY")
token = tm.get_token()
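
One thing to keep in mind: get_token() is an ordinary synchronous call, so it presumably holds up the event loop while it waits for the IAM service. The sketch below shows one way to move such a call off the loop with asyncio.to_thread (Python 3.9+). Here get_token_blocking is a hypothetical stand-in for tm.get_token() and the token value is made up; build_headers is also a hypothetical helper, producing the same bearer header format as the full example later in this article.

```python
import asyncio
import time


def get_token_blocking() -> str:
    """Hypothetical stand-in for a blocking call such as tm.get_token()."""
    time.sleep(0.1)  # simulates the round trip to the token service
    return "example-access-token"  # made-up value for illustration


def build_headers(token: str) -> dict:
    """Headers for a JSON request authenticated with a bearer token."""
    return {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer " + token,
    }


async def get_headers() -> dict:
    # to_thread runs the blocking function in a worker thread,
    # so other coroutines can keep running in the meantime.
    token = await asyncio.to_thread(get_token_blocking)
    return build_headers(token)


headers = asyncio.run(get_headers())
print(headers["Authorization"])
```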

watsonx Infer text payload

A few parameters need to be passed to watsonx to generate a response, such as model_id, input, project_id and so on. The request body is expected in raw form, so we can use json.dumps to serialize the payload into the right format before sending it to the watsonx endpoint.

import json

data = {
    "model_id": MODEL_ID,
    "input": PROMPT,
    "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
    },
    "project_id": PROJECT_ID
}

# Serialize the dict to a JSON string for the request body
json.dumps(data)

The full details of the payload are at https://cloud.ibm.com/apidocs/watsonx-ai#text-generation.
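
If you send more than one prompt, it can help to wrap the payload in a small function. This is only a sketch: build_payload is a hypothetical helper name, and its defaults simply mirror the values used in this article.

```python
import json


def build_payload(model_id: str, prompt: str, project_id: str,
                  max_new_tokens: int = 100, time_limit: int = 1000) -> str:
    """Return the text generation request body as a JSON string."""
    data = {
        "model_id": model_id,
        "input": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "time_limit": time_limit,
        },
        "project_id": project_id,
    }
    return json.dumps(data)


# Placeholder project ID, as in the rest of this article
print(build_payload("ibm/granite-13b-instruct-v2",
                    "how far is paris from bangalore:",
                    "PROJECT_ID"))
```

As an alternative to serializing by hand, aiohttp's post also accepts a json= keyword that takes the dict directly.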

Using aiohttp ClientSession

Import aiohttp, then use ClientSession to establish a session; the response is a ClientResponse.

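import aiohttp

# This fragment belongs inside an async function.
# Note: ssl=False turns off TLS certificate verification.
async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
    async with session.post(
        WATSONX_ENDPOINT,
        data=json.dumps(data),
        headers=headers,
    ) as response:
        result = await response.json()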

Piecing everything together

Now we can put all the code together. The full code is below:

import aiohttp
import asyncio
from ibmcloud_iam.token import TokenManager
import json

IBM_CLOUD_KEY = "IBM_CLOUD_API_KEY"  # placeholder: your IBM Cloud API key
MODEL_ID = "ibm/granite-13b-instruct-v2"
PROMPT = "how far is paris from bangalore:"
PROJECT_ID = "PROJECT_ID"  # placeholder: your watsonx project ID
WATSONX_ENDPOINT = "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-02"

async def main():
    # Exchange the API key for an access token
    tm = TokenManager(api_key=IBM_CLOUD_KEY)
    token = tm.get_token()
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer " + token
    }
    data = {
        "model_id": MODEL_ID,
        "input": PROMPT,
        "parameters": {
            "max_new_tokens": 100,
            "time_limit": 1000
        },
        "project_id": PROJECT_ID
    }
    print(json.dumps(data))
    # Note: ssl=False turns off TLS certificate verification
    async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
        async with session.post(
            WATSONX_ENDPOINT,
            data=json.dumps(data),
            headers=headers,
        ) as response:
            # Parse the JSON body and pull out the generated text
            result = await response.json()
            generated_text = result['results'][0]['generated_text']
            print(generated_text)

asyncio.run(main())
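
The full code sends a single prompt, but the payoff of going async comes when several prompts are in flight at once. Below is a rough sketch of how that could look. The names generate and generate_many are hypothetical helpers, the concurrency limit of 5 is an arbitrary assumption rather than a documented watsonx limit, and the raise_for_status() check is an addition that is not in the code above. The session argument is expected to be an aiohttp.ClientSession.

```python
import asyncio
import json
from typing import Any, Dict, List


async def generate(session: Any, endpoint: str, headers: Dict[str, str],
                   payload: Dict[str, Any]) -> str:
    """Send one text generation request and return the generated text."""
    async with session.post(endpoint, data=json.dumps(payload), headers=headers) as response:
        response.raise_for_status()  # surface HTTP errors instead of parsing an error body
        result = await response.json()
        return result["results"][0]["generated_text"]


async def generate_many(session: Any, endpoint: str, headers: Dict[str, str],
                        payloads: List[Dict[str, Any]],
                        max_concurrency: int = 5) -> List[str]:
    """Send several requests concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def limited(payload: Dict[str, Any]) -> str:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await generate(session, endpoint, headers, payload)

    # Results come back in the same order as payloads.
    return await asyncio.gather(*(limited(p) for p in payloads))
```

Inside main(), once the session is open, this could be called as texts = await generate_many(session, WATSONX_ENDPOINT, headers, payloads), where payloads is a list of dicts shaped like data with different input values.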

Happy coding!
