Async call to IBM watsonx.ai
asyncio is commonly used to run multiple operations concurrently with the async/await syntax. At the time of writing, the Python library provided by IBM does not yet have an async version of generate, so this article gives a simple example of using the aiohttp library to make async calls to the IBM watsonx REST API (https://cloud.ibm.com/apidocs/watsonx-ai). Let’s explore the code in a bit more detail.
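To make the concurrency point concrete, here is a minimal sketch that uses only the standard library. `fake_request` is a hypothetical stand-in for a network call (it only sleeps) and is not part of any IBM API; the delays are arbitrary.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    # Hypothetical stand-in for a network call: it only sleeps.
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_all() -> list[str]:
    # gather runs the coroutines concurrently and returns results in call order.
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single delay rather than the sum of all three.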
Endpoint
IBM watsonx is served from regional endpoints, including (https://cloud.ibm.com/apidocs/watsonx-ai#endpoint-url):
- Dallas: https://us-south.ml.cloud.ibm.com
- Frankfurt: https://eu-de.ml.cloud.ibm.com
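If you switch between regions, one possible approach is a small helper that builds the text generation URL from a region label. This is only a sketch: `REGION_HOSTS` and `generation_url` are hypothetical names, and the path and `version` value are copied from the endpoint URL used in the full code later in this article.

```python
from urllib.parse import urlencode

# Hosts taken from the list above; the dictionary keys are illustrative labels.
REGION_HOSTS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
}


def generation_url(region: str, version: str = "2023-05-02") -> str:
    # Path and version query parameter follow the URL used in the full example.
    host = REGION_HOSTS[region]
    return f"{host}/ml/v1/text/generation?{urlencode({'version': version})}"


print(generation_url("frankfurt"))
```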
Authentication
An access token is needed to make the REST API call. Here we can use an IBM Cloud API key to generate the access token with the ibmcloud-iam Python SDK.
from ibmcloud_iam.token import TokenManager

# Replace the placeholder with your IBM Cloud API key.
tm = TokenManager(api_key="<IBM_CLOUD_API_KEY>")
token = tm.get_token()
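The token is then sent as a Bearer token in the request headers, as the full code later in this article does. A minimal sketch of that step follows; `build_headers` is a hypothetical helper name and the token value is a placeholder.

```python
def build_headers(token: str) -> dict[str, str]:
    # The access token goes into the Authorization header as a Bearer token.
    return {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer " + token,
    }


headers = build_headers("example-token")  # placeholder, not a real token
print(headers["Authorization"])
```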
watsonx Infer text payload
A few parameters need to be passed to watsonx to generate a response, such as model_id, input, and project_id. The request body is expected as raw JSON, so we can use json.dumps to serialize the payload before sending it to the watsonx endpoint.
data = {
    "model_id": MODEL_ID,
    "input": PROMPT,
    "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
    },
    "project_id": PROJECT_ID
}
json.dumps(data)
Full details of the payload are at https://cloud.ibm.com/apidocs/watsonx-ai#text-generation.
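When the same payload shape is reused for many prompts, it may help to wrap it in a function. The sketch below assumes the fields shown above; `build_payload` and its default values are illustrative and not part of the IBM SDK.

```python
import json


def build_payload(model_id: str, prompt: str, project_id: str,
                  max_new_tokens: int = 100, time_limit: int = 1000) -> str:
    # Mirrors the fields shown above and returns the JSON string for the request body.
    data = {
        "model_id": model_id,
        "input": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "time_limit": time_limit,
        },
        "project_id": project_id,
    }
    return json.dumps(data)


body = build_payload(
    "ibm/granite-13b-instruct-v2",
    "how far is paris from bangalore:",
    "PROJECT_ID",  # placeholder project ID
)
print(body)
```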
Using aiohttp ClientSession
Import aiohttp, then use ClientSession to establish a session. The response to each request is a ClientResponse object.
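import aiohttp

# This snippet must run inside an async function (coroutine).
# ssl=False disables TLS certificate verification; remove it to keep verification on.
async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
    async with session.post(
        WATSONX_ENDPOINT,
        data=json.dumps(data),
        headers=headers,
    ) as response:
        result = await response.json()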
Piecing everything together
Now we can put all the code together. The full code is below:
import asyncio
import json

import aiohttp
from ibmcloud_iam.token import TokenManager

IBM_CLOUD_KEY = "IBM_CLOUD_API_KEY"  # replace with your IBM Cloud API key
MODEL_ID = "ibm/granite-13b-instruct-v2"
PROMPT = "how far is paris from bangalore:"
PROJECT_ID = "PROJECT_ID"  # replace with your watsonx project ID
WATSONX_ENDPOINT = "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-02"


async def main():
    # Exchange the API key for an access token.
    tm = TokenManager(api_key=IBM_CLOUD_KEY)
    token = tm.get_token()

    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer " + token
    }
    data = {
        "model_id": MODEL_ID,
        "input": PROMPT,
        "parameters": {
            "max_new_tokens": 100,
            "time_limit": 1000
        },
        "project_id": PROJECT_ID
    }
    print(json.dumps(data))

    # ssl=False disables TLS certificate verification; remove it to keep verification on.
    async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
        async with session.post(
            WATSONX_ENDPOINT,
            data=json.dumps(data),
            headers=headers,
        ) as response:
            # Parse the JSON body and pull out the generated text.
            result = await response.json()
            generated_text = result["results"][0]["generated_text"]
            print(generated_text)


asyncio.run(main())
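The full example sends a single prompt. A possible next step is to fan out over several prompts while capping how many requests are in flight. The sketch below is one way to do that with the standard library only: `generate_many`, `fake_send`, and `limit` are hypothetical names, and `fake_send` merely simulates a request. In practice `send` would wrap the `session.post` call from `main()` and reuse a single ClientSession.

```python
import asyncio
from typing import Awaitable, Callable


async def generate_many(
    prompts: list[str],
    send: Callable[[str], Awaitable[str]],
    limit: int = 5,
) -> list[str]:
    # The semaphore caps how many requests are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(prompt: str) -> str:
        async with semaphore:
            return await send(prompt)

    # Results come back in the same order as the prompts.
    return await asyncio.gather(*(bounded(p) for p in prompts))


async def fake_send(prompt: str) -> str:
    # Hypothetical stand-in for the session.post(...) call in main().
    await asyncio.sleep(0.05)
    return prompt.upper()


answers = asyncio.run(generate_many(["first", "second", "third"], fake_send, limit=2))
print(answers)
```

Passing the sender in as an argument keeps the fan-out logic separate from the HTTP details, which also makes it easy to try out without network access.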
Happy coding!