I created a canister on my computer and deployed it as described in the tutorial.
Then I added a tokenizer in pure JavaScript and a loop that works around the model's short responses.
The whole thing is packaged as a library, so I can use it in my own projects in the future.
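
A rough sketch of what such a loop could look like, assuming the model returns only a few token ids per call and that the tokens generated so far are fed back in on the next call. The names `generateLong`, `inferChunk`, `makeFakeModel` and the `EOS` token id are hypothetical and are not taken from the library; a real canister call would also be asynchronous and would need to be awaited.

```javascript
// Hypothetical sketch: all names and the EOS id are assumptions, not the library's API.
const EOS = 0; // assumed end-of-sequence token id

// Call the model repeatedly, feeding the growing token list back in,
// until it emits EOS or the budget of new tokens is used up.
function generateLong(inferChunk, promptTokens, maxNewTokens) {
  const tokens = [...promptTokens];
  let produced = 0;
  while (produced < maxNewTokens) {
    const chunk = inferChunk(tokens); // short array of token ids
    if (chunk.length === 0) break; // nothing new, so stop instead of spinning
    for (const t of chunk) {
      if (t === EOS || produced >= maxNewTokens) {
        return tokens.slice(promptTokens.length);
      }
      tokens.push(t);
      produced++;
    }
  }
  // Return only the newly generated tokens, without the prompt.
  return tokens.slice(promptTokens.length);
}

// Stand-in for the canister call: hands out a fixed script of token ids,
// at most `perCall` of them per invocation.
function makeFakeModel(script, perCall = 2) {
  let pos = 0;
  return () => {
    const out = script.slice(pos, pos + perCall);
    pos += out.length;
    return out;
  };
}

const model = makeFakeModel([5, 6, 7, EOS, 9]);
console.log(generateLong(model, [1, 2], 10)); // [ 5, 6, 7 ]
```

The stand-in model only exists to make the sketch runnable; in practice `inferChunk` would wrap the call to the deployed canister, and the resulting token ids would go through the tokenizer to get text back.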