Train an Edge Model with Your Own CSV Dataset

Overview

This tutorial shows you how to train an edge model using your own CSV data with our MLE-Agent.

Prerequisites

Make sure that you have installed MLE-Agent version 0.4.0 or later.

pip install mle-agent -U
# or from source
git clone git@github.com:MLSysOps/MLE-agent.git
cd MLE-agent
pip install -e .
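
To confirm that the installed version meets the 0.4.0 requirement, a small check along these lines may help. It is a minimal sketch: it assumes the package is registered under the distribution name `mle-agent` (the name used in the `pip install` command above), and the helpers `parse_version`, `meets_minimum`, and `installed_mle_agent_version` are hypothetical names introduced here, not part of MLE-Agent.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (0, 4, 0)  # minimum version stated in the prerequisites


def parse_version(text):
    """Turn '0.4.2' into (0, 4, 2), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_VERSION):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def installed_mle_agent_version():
    """Return the installed version string, or None if it is not installed.

    Assumes the distribution name is 'mle-agent'.
    """
    try:
        return version("mle-agent")
    except PackageNotFoundError:
        return None


found = installed_mle_agent_version()
if found is None:
    print("mle-agent is not installed")
elif meets_minimum(found):
    print(f"mle-agent {found} is new enough")
else:
    print(f"mle-agent {found} is older than 0.4.0; please upgrade")
```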

Prepare your CSV data.

We will use the IMDB Dataset of 50K Movie Reviews as an example.
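
Before handing the file to MLE-Agent, it can be useful to check that the CSV parses cleanly and that the labels look as expected. The following is a minimal sketch using only the Python standard library. The function name `summarize_csv` is hypothetical, and the default label column `sentiment` is an assumption about the IMDB file; adjust it to match your own data.

```python
import csv
from collections import Counter


def summarize_csv(path, label_column="sentiment"):
    """Return column names, row count, and label frequencies for a CSV file.

    `label_column` is an assumed column name; change it for your dataset.
    """
    labels = Counter()
    row_count = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        for row in reader:
            row_count += 1
            if label_column in row:
                labels[row[label_column]] += 1
    return {"columns": columns, "rows": row_count, "labels": dict(labels)}
```

For example, `summarize_csv("path/to/your.csv")` (a placeholder path) returns a dictionary you can print to see the column names and how balanced the classes are.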

Steps

1. Create a new project.

mle new imdb-sentiment-analysis

You need to pick an LLM provider and enter your API key. We recommend OpenAI for its performance and stability.

We also recommend enabling the Web Search function, powered by Tavily, for better results.

2. Start the project and input your requirements.

Go to the project directory and run the following command.

cd imdb-sentiment-analysis
mle start

You will then be asked to enter the path to your CSV data and your specific requirements for the baseline model.

3. Check and modify the generated research proposal

After a few seconds, you will see the generated proposal.

You can revise it at any time by entering detailed suggestions.

4. Code task breakdown

After you approve the proposal, the system will come up with a development plan.

You can change the task breakdown by entering your feedback.

5. Code and Debug

Once you approve the task breakdown, the system will automatically generate code for you and, if you choose, debug it as well.
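
The code MLE-Agent generates depends on your requirements and the chosen LLM, so it will differ from run to run. As a reference point when reviewing it, here is a minimal sketch of what a sentiment baseline can look like: a bag-of-words Naive Bayes classifier in plain Python. This is an illustration only, not the agent's output; the names `tokenize`, `train_naive_bayes`, and `predict` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Count words per label; these counts are the whole 'model'."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    tokens = tokenize(text)
    best_label, best_score = None, -math.inf
    for label, doc_count in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / total_docs)  # class prior
        for token in tokens:
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy usage with made-up reviews (not taken from the IMDB dataset).
model = train_naive_bayes(
    ["great wonderful film", "terrible boring film"],
    ["positive", "negative"],
)
print(predict(model, "a wonderful story"))
```

A baseline like this gives you a number to beat, which makes it easier to judge whether the generated model is actually learning something.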

Summary

In this tutorial, we showed you how to build a baseline model using your own CSV data with MLE-Agent. If you have any questions, please feel free to contact us or open a new GitHub issue.
