How to configure the pipeline

The global_vars.py file acts as the control panel for your BojAI pipeline.

It defines:

The task type and evaluation metric
Training hyperparameters
How to load your model and tokenizer
Initialization logic if needed
UI/CLI behavior toggles
Optional model or tokenizer choices for users

🎛️ UI and CLI Behavior

The browseDict variable controls how BojAI behaves in the interface and command line.

It answers questions such as:

Should the deploy tab accept uploaded data?
Should the user be allowed to choose between models?
What type of input is expected: image, text, or audio?

browseDict = {
    "train": False,
    "prep": False,
    "deploy_new_data": False,
    "use_model_upload": True,
    "use_model_text": "Enter one picture to see output",
    "init": False,
    "type": 0,
    "eval matrice": "perplexity",
    "options": 0,
    "options-where": -1,
}
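
A misspelled key in a dictionary like this is easy to miss. The snippet below is a minimal sketch of a sanity check, not part of BojAI. It assumes the ten keys in the example above are the complete expected set, and the helper name `missing_keys` is hypothetical.

```python
# Hypothetical sanity check; BojAI itself may validate browseDict differently.
# Assumption: the keys in the example above are the complete expected set.
EXPECTED_KEYS = [
    "train", "prep", "deploy_new_data", "use_model_upload", "use_model_text",
    "init", "type", "eval matrice", "options", "options-where",
]

browseDict = {
    "train": False,
    "prep": False,
    "deploy_new_data": False,
    "use_model_upload": True,
    "use_model_text": "Enter one picture to see output",
    "init": False,
    "type": 0,
    "eval matrice": "perplexity",
    "options": 0,
    "options-where": -1,
}


def missing_keys(config):
    """Return the expected keys that are absent from config, in order."""
    return [key for key in EXPECTED_KEYS if key not in config]


print(missing_keys(browseDict))  # prints []
```

Running a check like this before starting the pipeline gives a clear message about a missing key before the interface runs.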

🧩 Optional Model/Tokenizer Options

If you want to let users choose from multiple models or tokenizers, define the options like so:

from model import CNNModel, TransformerModel

options = {
    "cnn": CNNModel,
    "transformer": TransformerModel
}

Make sure "options" is enabled in browseDict, and set "options-where" to indicate what the options apply to:

0 → tokenizer
1 → model
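
Put together, a configuration that lets users choose a model might look like the sketch below. It assumes that "enabled" means setting "options" to 1 (the earlier example uses 0), which the text does not state explicitly. The two classes are empty stand-ins for the ones imported from model, and `resolve_choice` is a hypothetical helper that only illustrates the lookup. It is not a BojAI function.

```python
# Stand-ins so the sketch runs on its own; in a real pipeline these
# would come from `from model import CNNModel, TransformerModel`.
class CNNModel:
    pass


class TransformerModel:
    pass


options = {"cnn": CNNModel, "transformer": TransformerModel}

# Only the two option-related entries are shown; the other keys stay as before.
# Assumption: 1 enables "options"; "options-where": 1 means the choice is a model.
browse_overrides = {"options": 1, "options-where": 1}


def resolve_choice(name):
    """Hypothetical lookup: map the user's selection to the matching class."""
    if name not in options:
        raise KeyError(f"unknown option {name!r}; choose from {sorted(options)}")
    return options[name]


model_cls = resolve_choice("cnn")
print(model_cls.__name__)  # prints CNNModel
```

Keeping the keys of options short and descriptive helps, since they are presumably what the user sees when choosing.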

✅ You’re Ready!

You’ve now configured the final part of your pipeline.
Everything else (data loading, training, and usage) is already connected through your other files.

▶️ Use Your Full Pipeline

Make sure you’ve built your pipeline before running it.

In the CLI:

bojai start --pipeline give-it-a-name --directory where/the/editing/files/are --stage train

In the UI:

bojai start --pipeline give-it-a-name --directory where/the/editing/files/are --stage train --ui
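
The two commands differ only in the trailing --ui flag. If you launch the pipeline from a script, a minimal sketch like the one below can assemble either form. It uses only the flags shown above. `build_start_command` is a hypothetical helper, and the pipeline name and directory are the same placeholders as in the commands.

```python
import shlex


def build_start_command(pipeline, directory, stage="train", ui=False):
    """Assemble the `bojai start` argument list from the flags shown above."""
    args = ["bojai", "start", "--pipeline", pipeline,
            "--directory", directory, "--stage", stage]
    if ui:
        # The UI variant is the same command with --ui appended.
        args.append("--ui")
    return args


cmd = build_start_command("give-it-a-name", "where/the/editing/files/are", ui=True)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` once the placeholders are replaced with your real pipeline name and directory.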

🚀 You’re now ready to train, evaluate, and deploy with your own BojAI pipeline!

Updated on 19 Apr 2025