data.py

class mlui.classes.data.Data

Bases: object

Class representing the contents of a data file.

This class provides methods for managing and interacting with a DataFrame constructed from the data file.

reset_state()

Reset the state of the DataFrame.

This method resets the internal state of the DataFrame to an empty object.

update_state()

Update the internal state of the DataFrame, resetting its columns and unused columns.

upload(buff)

Upload data from a file into the DataFrame.

Parameters:

buff (file-like object) – Byte buffer containing the data.

Raises:

UploadError – If the file cannot be parsed, if it cannot be read into the DataFrame, or if the resulting DataFrame fails validation.
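The three failure stages listed above (parsing, reading, validating) all surface as one exception type. The following is a minimal sketch of that pattern, not mlui's implementation: it uses the standard-library csv module instead of a DataFrame so that it runs without dependencies, and the UploadError class, the read_table helper, and the "at least one row" validation rule are all assumptions made for illustration.

```python
import csv
import io


class UploadError(Exception):
    """Hypothetical stand-in for mlui's UploadError."""


def read_table(buff):
    """Parse a CSV byte buffer into a list of row dicts, or raise UploadError."""
    # Stage 1: decode the raw bytes.
    try:
        text = buff.read().decode("utf-8")
    except (AttributeError, UnicodeDecodeError) as exc:
        raise UploadError("cannot parse the file") from exc

    # Stage 2: read the text into a table.
    try:
        rows = list(csv.DictReader(io.StringIO(text)))
    except csv.Error as exc:
        raise UploadError("cannot read the file into a table") from exc

    # Stage 3: validate the table (assumed rule: at least one data row).
    if not rows:
        raise UploadError("the table is empty")
    return rows


rows = read_table(io.BytesIO(b"a,b\n1,2\n3,4\n"))
print(rows)  # [{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}]
```

Wrapping every stage in the same exception type lets a caller handle a failed upload with a single except clause while the chained cause still records which stage failed.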

set_unused_columns(available, selected)

Set the unused columns based on the available and selected columns.

Parameters:
  • available (list of str) – Available columns to choose from.

  • selected (list of str) – Columns to set as used.

Raises:

SetError – If there is an issue setting the unused columns.
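One plausible reading of the relationship between the two parameters is that the unused columns are the available columns that were not selected. The sketch below illustrates that reading only; the SetError class, the compute_unused helper, and the "unknown column" failure condition are assumptions, not mlui's actual behaviour.

```python
class SetError(Exception):
    """Hypothetical stand-in for mlui's SetError."""


def compute_unused(available, selected):
    """Return the available columns that were not selected, keeping their order."""
    unknown = [name for name in selected if name not in available]
    if unknown:
        # Assumed failure mode: a selected column is not among the available ones.
        raise SetError(f"unknown columns: {unknown}")
    chosen = set(selected)
    return [name for name in available if name not in chosen]


unused = compute_unused(["age", "height", "weight"], ["age", "weight"])
print(unused)  # ['height']
```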

get_unused_columns()

Get the currently unused columns of the DataFrame.

Returns:

Currently unused columns.

Return type:

list of str

get_stats()

Get descriptive statistics and data types information for the DataFrame.

Returns:

DataFrame containing descriptive statistics and data types information.

Return type:

DataFrame

Raises:

PlotError – If there is an issue generating the statistics.
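To make "descriptive statistics and data types information" concrete, here is a dependency-free sketch that summarises a small table stored as a dict of lists. It is illustrative only: the describe helper, the chosen statistics (count, mean, min, max), and the table layout are assumptions, and the real method returns a DataFrame rather than a dict.

```python
import statistics


def describe(table):
    """Summarise each column of a dict-of-lists table."""
    stats = {}
    for name, values in table.items():
        # Treat a column as numeric only if every value is an int or float (not bool).
        numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        summary = {
            "dtype": type(values[0]).__name__ if values else "empty",
            "count": len(values),
        }
        if numeric:
            summary.update(
                mean=statistics.mean(values), min=min(values), max=max(values)
            )
        stats[name] = summary
    return stats


stats = describe({"age": [30, 45, 27], "city": ["Oslo", "Rome", "Lima"]})
print(stats["age"])
print(stats["city"])  # {'dtype': 'str', 'count': 3}
```

Non-numeric columns keep only their type and count here, which mirrors the idea of reporting data types alongside the numeric summary.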

plot_columns(x, y, points)

Plot columns from the DataFrame.

Parameters:
  • x (str or None) – Column to use for the x-axis.

  • y (str or None) – Column to use for the y-axis.

  • points (bool) – Whether to include points on the plot.

Returns:

Altair chart representing the plot.

Return type:

Chart

Raises:

PlotError – If there is an issue generating the plot.

property dataframe: DataFrame

Copy of the DataFrame.

property columns: list[str]

Names of the columns in the DataFrame.

property has_nans: bool

True if there are NaN values in the DataFrame, False otherwise.

property has_nonnumeric_dtypes: bool

True if the DataFrame contains columns with non-numeric data types, False otherwise.

property empty: bool

True if the DataFrame is empty, False otherwise.
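The three boolean properties above can be thought of as simple checks over the table's values. The sketch below shows equivalent checks on a dict-of-lists table using only the standard library; the function names and the table layout are assumptions for illustration and do not reflect how mlui computes these flags.

```python
import math


def has_nans(table):
    """True if any value in any column is a float NaN."""
    return any(
        isinstance(v, float) and math.isnan(v)
        for col in table.values()
        for v in col
    )


def has_nonnumeric(table):
    """True if any value is not an int or float (bools count as non-numeric here)."""
    return any(
        isinstance(v, bool) or not isinstance(v, (int, float))
        for col in table.values()
        for v in col
    )


def is_empty(table):
    """True if the table has no columns or every column has no rows."""
    return all(len(col) == 0 for col in table.values())


table = {"x": [1.0, float("nan")], "label": ["a", "b"]}
print(has_nans(table))        # True
print(has_nonnumeric(table))  # True
print(is_empty(table))        # False
```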