VevestaX: A library to track machine learning experiments and data in an Excel file with 5 lines of code

MLOps library to track ML experiments

Nov 07, 2021

VevestaX is an open source initiative by Vevesta Labs. It helps you manage both failed and successful experiments along with their corresponding features. The library generates an Excel file to track the data and experiments. It can currently be used only in a Jupyter notebook environment. With approximately 5 lines of code, you can track both your experiments and your feature set.

Step 1: How to install vevestaX

pip install vevestaX

Step 2: How to import the library and create a VevestaX object

# Import the library

from vevestaX import vevesta as vev

V = vev.Experiment()

Step 3: How to extract features from the data source

import pandas

df = pandas.read_csv("data")

V.dataSourcing = df

# Alternatively, ds can be used instead of dataSourcing

V.ds = df

Step 4: How to extract features engineered during the modelling process.

V.featureEngineering = df

# Alternatively, fe can be used instead of featureEngineering

V.fe = df

Step 5: Track variables by placing the code that defines them between V.startModelling() (or V.start() for short) and V.endModelling() (or V.end() for short). This pair of calls can be used as a block multiple times within the code.

V.start()

# Do some modelling

epoch = 100

precision = 0.98

V.end()

This code block tracks the parameters defined within it so that they can be written to the Excel file.
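To make the start/end pattern more concrete, here is a minimal sketch of one way such tracking could work: take a snapshot of the variables at the start, then compare it with the variables at the end. The VariableTracker class and the explicit namespace dictionary are hypothetical stand-ins written for illustration. They are not part of vevestaX and do not claim to mirror its internals.

```python
class VariableTracker:
    """Hypothetical illustration only; not the vevestaX implementation."""

    # Assumption: only simple scalar values are worth recording.
    _TRACKED_TYPES = (int, float, str, bool)

    def __init__(self):
        self._before = {}
        self.tracked = {}

    def start(self, namespace):
        # Remember which variables existed before the modelling block.
        self._before = dict(namespace)

    def end(self, namespace):
        # Record every scalar variable that is new or changed since start().
        for name, value in namespace.items():
            if name.startswith("_"):
                continue
            if not isinstance(value, self._TRACKED_TYPES):
                continue
            if name not in self._before or self._before[name] != value:
                self.tracked[name] = value
        return self.tracked


# A plain dict stands in for the notebook's variables.
namespace = {"learning_rate": 0.1}

tracker = VariableTracker()
tracker.start(namespace)

# "Do some modelling": these assignments happen inside the tracked block.
namespace["epoch"] = 100
namespace["precision"] = 0.98

result = tracker.end(namespace)
print(result)
```

In this sketch, learning_rate is not recorded because it existed before start() and did not change, while epoch and precision are picked up because they were introduced inside the block.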

Step 6: Write the experiment to an output Excel file by calling V.dump(). Its optional parameters are filename, message and version.

V.dump(techniqueUsed="XGBoost", filename="sample.xlsx", message="test message", version=1)

This writes the experiment to the file sample.xlsx. Alternatively, the experiment can be written to the default file, vevesta.xlsx, as follows:

V.dump(techniqueUsed="XGBoost")

The output file contains 4 tabs. The first tab, dataSourcing, holds the initial set of features in each experiment. A value of 1 means the feature was present in that run, and 0 means it was absent.
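The 1/0 encoding can be illustrated with a small sketch that builds a presence table from the column names seen in each run. The feature_presence function, the run numbers and the feature names are all made up for this example, and the exact layout of the real tab may differ.

```python
def feature_presence(runs):
    """Map each run to {feature: 1 if present in that run, else 0}.

    `runs` maps a run identifier to the list of feature (column) names
    used in that run. Hypothetical helper, not part of vevestaX.
    """
    # Collect every feature seen in any run, keeping first-seen order.
    all_features = []
    for columns in runs.values():
        for column in columns:
            if column not in all_features:
                all_features.append(column)

    return {
        run: {feature: int(feature in columns) for feature in all_features}
        for run, columns in runs.items()
    }


# Two made-up runs: the second one adds a "tenure" feature.
runs = {
    1: ["age", "salary"],
    2: ["age", "salary", "tenure"],
}

matrix = feature_presence(runs)
for run, row in matrix.items():
    print(run, row)
```

Read this way, a column of 1s and 0s shows at a glance in which experiments a feature was introduced or dropped.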

The feature engineering tab is similar to the dataSourcing tab. It lists the features captured with V.fe.

The modelling tab lists the tracked modelling parameters.

The output file has one more tab, messages.

The repository is open source and can be accessed at the following link:

https://github.com/Vevesta/VevestaX

You can find more details about the tool at www.vevesta.com. You can send suggestions and feedback about VevestaX to vevestax@vevesta.com.

The first 100 users who log in to Vevesta get a free subscription for 3 months.
