Automated scaling of IBM Code Engine application with Python
Serverless applications avoid paying for idle resources and drive down the cost of consumption, but they also come with a drawback: cold start time. Recently I have been building more Generative AI applications and deploying them on IBM Code Engine. I noticed that the startup time of a serverless application gets longer and longer as the container image grows. This is definitely not the best user experience, and it gives clients the impression that the application is slow or not working. Theoretically, I can always set the minimum number of instances to 1, which means there will always be at least a single pod that can promptly service requests even when there has been no activity. However, this defeats the purpose of serverless in the first place.
I wrote a simple Python application that sets the minimum number of instances automatically on a schedule. What it does is simply flip the value between 1 and 0 based on the period during which I want the application to have at least a single pod running, typically office hours from Monday to Friday.
In this article I will share what the application looks like.
Python code
The Python code is relatively simple, at fewer than 60 lines. The logic flow is as below:
A few environment variables are needed for this application to run:
- IBM_CLOUD_KEY: API key used to access the IBM Cloud API programmatically.
- PROJECT_ID: ID of the Code Engine project. This can be retrieved from the IBM Cloud console URL, for example: https://cloud.ibm.com/codeengine/project/jp-tok/PROJECT_ID/overview
- APPLICATION_NAME: Name of the application. To schedule multiple applications, separate the names with commas, for example: tecbackend-app,tec-frontend
- TIMEZONE_OFFSET: Timezone adjustment relative to UTC/GMT, for example: Singapore is 8, representing GMT+8
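As a rough illustration of how these four variables could be consumed, here is a minimal sketch. The `Settings` class and the `load_settings` and `local_time` helpers are hypothetical names chosen for this example and are not taken from the actual source code; only the environment variable names come from the list above.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Mapping


@dataclass
class Settings:
    """Hypothetical container for the four environment variables."""
    api_key: str
    project_id: str
    app_names: List[str]
    tz_offset_hours: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the configuration from a mapping (defaults to os.environ)."""
    # APPLICATION_NAME may hold several comma-separated names;
    # strip whitespace and drop empty entries.
    names = [n.strip() for n in env["APPLICATION_NAME"].split(",") if n.strip()]
    return Settings(
        api_key=env["IBM_CLOUD_KEY"],
        project_id=env["PROJECT_ID"],
        app_names=names,
        tz_offset_hours=int(env["TIMEZONE_OFFSET"]),
    )


def local_time(utc_now: datetime, tz_offset_hours: int) -> datetime:
    """Shift a timezone-aware UTC timestamp by the configured offset (8 for Singapore)."""
    return utc_now.astimezone(timezone(timedelta(hours=tz_offset_hours)))
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to try locally before deploying the job.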
On the coding side, a successful update requires a parameter called entity_tag, which ensures that the right version of the application is updated. More information can be found at https://cloud.ibm.com/apidocs/codeengine/v2?code=python#update-app
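The read-then-update pattern around entity_tag can be sketched roughly as follows. This is not necessarily how the actual source code is written. The function name `toggle_min_instances` is hypothetical, and `service` is assumed to be a Code Engine v2 client whose `get_app` and `update_app` methods take the parameters shown in the API reference linked above, with the entity tag passed back as `if_match`.

```python
def toggle_min_instances(service, project_id: str, app_name: str) -> int:
    """Flip scale_min_instances between 0 and 1 and return the new value.

    `service` is assumed to behave like the Code Engine v2 Python client:
    get_app/update_app return an object exposing get_result().
    """
    # Fetch the current application; the result carries its entity_tag.
    current = service.get_app(project_id=project_id, name=app_name).get_result()

    # Reverse the value: anything >= 1 becomes 0, otherwise 1.
    new_min = 0 if current.get("scale_min_instances", 0) >= 1 else 1

    # Send the entity_tag back as if_match so the update only applies
    # to the version of the application that was just read.
    service.update_app(
        project_id=project_id,
        name=app_name,
        if_match=current["entity_tag"],
        app={"scale_min_instances": new_min},
    )
    return new_min
```

With the real SDK, the client would typically be created with an IAM authenticator built from IBM_CLOUD_KEY, and the function called once per name in APPLICATION_NAME. Keeping the client as a parameter also allows the logic to be exercised with a stub before touching a live project.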
Once the application was ready, I built it as a Docker container image and uploaded it to IBM Cloud Container Registry. I named the image codeengine-ro (ro stands for Resource Optimizer).
The source code is available at https://github.com/ongkhaiwei/code-engine-resource-optimizer
Create scheduling in IBM Cloud Code Engine
The first thing to do here is to create a job in IBM Cloud Code Engine.
Once the job is created, let's define when it will be executed. To do this, let's create an event subscription. We select periodic timer so that we can decide exactly when it will run, and give the event subscription a name.
The Schedule section is the interesting part, where we define when the job will execute. You can either use the dropdown or, for advanced users, write your own cron schedule expression. I used https://crontab.guru/ to help me create the expression I wanted.
Click Next to proceed to the Event consumer section. This is where you select the job you created previously as the one to run.
Proceed to the end to create the event subscription. A success notification will pop up and you are done!
The job will complete in a split second.
You can go back to the Application page to check the status of the deployment.
In this case you can define two event subscriptions, one to scale up and one to scale down. tec-timer-start has the cron expression 0 1 * * 1,2,3,4,5 (01:00 UTC), which means it runs every Monday to Friday at 9:00am Singapore time; tec-timer-end has the cron expression 0 10 * * 1,2,3,4,5 (10:00 UTC), which means it runs every Monday to Friday at 6:00pm Singapore time.
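To double-check the arithmetic between a cron expression and local time, a small helper like the one below can be used. It is only a sketch: `cron_to_local` is a hypothetical name, it assumes the schedule is evaluated in UTC (which is what the two examples above imply), and it only handles plain numeric minute and hour fields.

```python
def cron_to_local(expr: str, tz_offset_hours: int) -> str:
    """Return the local HH:MM at which a simple cron expression fires.

    Assumes the cron schedule is in UTC and that the minute and hour
    fields are plain numbers. The day-of-week fields are not adjusted,
    so an offset that crosses midnight would need extra handling.
    """
    minute, hour = expr.split()[:2]
    local_hour = (int(hour) + tz_offset_hours) % 24
    return f"{local_hour:02d}:{int(minute):02d}"


for name, expr in [
    ("tec-timer-start", "0 1 * * 1,2,3,4,5"),
    ("tec-timer-end", "0 10 * * 1,2,3,4,5"),
]:
    # Prints "tec-timer-start 09:00" and "tec-timer-end 18:00" for offset 8.
    print(name, cron_to_local(expr, 8))
```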
Happy coding!