How to build and version control pipelines

Before you can use a pipeline, you need to build it. Building merges the pipeline's own logic with the CLI and UI logic so that you can run it from either interface.

Building a pre-trained pipeline

BojAI includes several pre-trained pipelines that you do not need to code yourself; you can use them as they are to process data, train, and deploy. Follow the steps below to use one:

  1. Run bojai list --pipelines to see which pipelines are available. You can also refer to a pipeline’s documentation for details on what it does, how it works, and which models it uses.

  2. Once you know which pipeline to use, run bojai build --pipeline chosen-pipeline-name.

Now you have a built pipeline, and you can use it through either the command-line interface or the visual interface.
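The two steps above can be sketched in Python as follows. This is only an illustration, not part of BojAI: the helper name pretrained_build_commands is hypothetical, chosen-pipeline-name is the placeholder from step 2, and the script prints the commands rather than running them, so it works without BojAI installed.

```python
import shlex


def pretrained_build_commands(pipeline_name):
    """Return the CLI calls from steps 1 and 2 as argument lists."""
    return [
        # Step 1: list the pipelines that are available.
        ["bojai", "list", "--pipelines"],
        # Step 2: build the pipeline you chose.
        ["bojai", "build", "--pipeline", pipeline_name],
    ]


# Print each command in a copy-pasteable form (nothing is executed here).
for argv in pretrained_build_commands("chosen-pipeline-name"):
    print(shlex.join(argv))
```

To actually run a command from a script, each argument list can be passed to subprocess.run(argv, check=True).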

Building a custom pipeline

To build a custom pipeline, first create it by following the steps below:

  1. Run bojai create --pipeline your-custom-pipeline --directory directory-where-to-save-pipeline-folder.

  2. Open the pipeline folder in the directory you specified, and add your data processing, training, evaluation, and model usage logic.

Once you have finished coding a stage (the stages are marked in the .py files in your directory), you can build your pipeline by following the steps below:

  1. Run bojai list --built to check whether you have already built your pipeline. If you have, you can rebuild it, but the previously built files will be overwritten.

  2. Run bojai build --directory custom-pipeline-directory if this is the first time you are building it, or
    run bojai build --directory custom-pipeline-directory --replace if you have already built it and want to update it.

Now you have a built pipeline, and you can use it through either the command-line interface or the visual interface.
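The create and build commands in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of BojAI: the helper names create_command and build_command are hypothetical, the directory names are placeholders, and already_built is a value you decide yourself after reading the output of bojai list --built, since the format of that output is not described here. Only the flags shown in the steps above are used.

```python
import shlex


def create_command(name, directory):
    """Argument list for creating a new custom pipeline."""
    return ["bojai", "create", "--pipeline", name, "--directory", directory]


def build_command(directory, already_built):
    """Argument list for building; adds --replace only when rebuilding."""
    argv = ["bojai", "build", "--directory", directory]
    if already_built:
        # A rebuild overwrites the previously built files.
        argv.append("--replace")
    return argv


# Print the commands rather than running them (placeholder names and paths).
print(shlex.join(create_command("your-custom-pipeline", "pipelines")))
print(shlex.join(build_command("custom-pipeline-directory", already_built=False)))
print(shlex.join(build_command("custom-pipeline-directory", already_built=True)))
```

Keeping the first-build and rebuild cases in one function makes the only difference between them, the --replace flag, explicit.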

Version control through git

You can use git to version control your custom pipeline. This is especially useful if you’re actively improving the processor, trainer, or user logic over time, or working with a team.

Version control applies to the custom pipeline’s directory, the one you pass to --directory when building.

Follow the steps below to version control your pipeline:

  1. Go to your custom pipeline directory:
    cd path/to/your/custom-pipeline

  2. Initialize a git repository in the folder:
    git init

  3. Add the files you want to track:
    git add .

  4. Commit your changes:
    git commit -m "Initial version of my custom pipeline"

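  5. (Optional) Connect to a remote repository (like GitHub or GitLab) to back up your work or collaborate. The push below assumes your local branch is named main; if git init created a branch with a different name (such as master), rename it first with git branch -M main:

    git remote add origin https://github.com/yourusername/yourrepo.git
    git push -u origin main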
    
  6. Every time you make changes to your code (like improving your trainer or processor), you can repeat the following:

    git add .
    git commit -m "Describe what you changed"
    git push
    
  7. Finally, after each git update or change to your pipeline, remember to rebuild it before using it:
    bojai build --directory path/to/your/custom-pipeline --replace

This makes sure the most recent version of your code is what gets run when you use BojAI.
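Steps 6 and 7 are easy to forget as a pair, so one option is to script them together. The sketch below is an illustration only, not part of BojAI: the function names are hypothetical, the path and commit message are the placeholders from the steps above, and by default it performs a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def update_and_rebuild_commands(pipeline_dir, message, push=True):
    """Return (argv, cwd) pairs for steps 6 and 7, in order."""
    steps = [
        (["git", "add", "."], pipeline_dir),
        (["git", "commit", "-m", message], pipeline_dir),
    ]
    if push:
        # Only meaningful if a remote was configured in step 5.
        steps.append((["git", "push"], pipeline_dir))
    # The rebuild takes the directory as an argument, so no cwd is needed.
    steps.append((["bojai", "build", "--directory", pipeline_dir, "--replace"], None))
    return steps


def run_all(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in steps:
        print("$", shlex.join(argv))
        if not dry_run:
            # check=True raises on the first failing command, so a failed
            # commit or push never leads to a rebuild.
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run with the placeholder path and message from the steps above.
run_all(update_and_rebuild_commands("path/to/your/custom-pipeline",
                                    "Describe what you changed"))
```

Note that git commit exits with an error when there is nothing to commit, so with dry_run=False this sketch stops before the rebuild in that case. Pass push=False if you skipped the optional remote in step 5.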

You now have a version-controlled pipeline that is easier to manage, track, and share.