pkoffee

PKoffee - Coffee Productivity Analysis Package.

A comprehensive toolkit for analyzing the relationship between coffee consumption and productivity through statistical modeling and visualization.

Submodules

Command-line interface for PKoffee analysis.

exception pkoffee.cli.MissingVisualizationDependenciesError[source]

Bases: ImportError

Error when visualization dependencies are missing.

class pkoffee.cli.PKoffeCommands(*values)[source]

Bases: StrEnum

Commands of the pkoffee CLI.

ANALYZE = 'analyze'

PLOT = 'plot'

exception pkoffee.cli.UnsupportedCommandError(command: str)[source]

Bases: NotImplementedError

Unsupported Command Error.

class pkoffee.cli.PKoffeArgParseFormatter(prog, indent_increment=2, max_help_position=24, width=None)[source]

Bases: RawTextHelpFormatter, ArgumentDefaultsHelpFormatter

Combine the RawTextHelpFormatter and ArgumentDefaultsHelpFormatter.

The purpose of this class is to not format description and epilog of the parser (behavior of RawTextHelpFormatter) while showing defaults for arguments (behavior of ArgumentDefaultsHelpFormatter).
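
The effect of combining the two formatters can be sketched with the standard argparse module alone. The class name DemoFormatter, the program name and the --iterations option below are illustrative assumptions, not the package's actual definitions.

```python
import argparse


class DemoFormatter(argparse.RawTextHelpFormatter, argparse.ArgumentDefaultsHelpFormatter):
    """Hypothetical stand-in: keep description/epilog text as written and show argument defaults."""


parser = argparse.ArgumentParser(
    prog="demo",
    description="First line.\n  Indented second line, kept as written.",
    formatter_class=DemoFormatter,
)
# "--iterations" is an assumed option, used only to show how a default is rendered.
parser.add_argument("--iterations", type=int, default=20000, help="Maximum number of iterations")

# format_help() returns the help text without parsing sys.argv.
help_text = parser.format_help()
print(help_text)
```

The description keeps its line break and indentation, and the option's help line ends with its default value.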

pkoffee.cli.pkoffe_argparser() → ArgumentParser[source]: Define the arguments of the PKoffee CLI.

pkoffee.cli.main() → None[source]: Parse arguments and execute input command.

Data loading and preprocessing utilities for coffee productivity analysis.

class pkoffee.data.RequiredColumn(*values)[source]

Bases: StrEnum

Required columns in the coffee productivity CSV data.

CUPS = 'cups'

PRODUCTIVITY = 'productivity'

exception pkoffee.data.CSVReadError(filepath: Path)[source]

Bases: RuntimeError

Exception for data input failure.

exception pkoffee.data.MissingColumnsError(missing_columns: set[RequiredColumn])[source]

Bases: ValueError

Exception for missing required columns in data.

exception pkoffee.data.ColumnTypeError(col: RequiredColumn, dtype: dtype)[source]

Bases: ValueError

Exception for invalid column type.

pkoffee.data.validate(data: DataFrame) → None[source]

Validate data content by checking column presence and types.

Parameters:

data (pd.DataFrame) – pandas DataFrame to validate.

Raises:

MissingColumnsError – If a required column is missing from the DataFrame.
ColumnTypeError – If a required column has an invalid type. Required columns are expected to have a numerical type.

pkoffee.data.curate(data: DataFrame) → DataFrame[source]

Curate data by removing rows with NaN values.

Parameters:: data (pd.DataFrame) – DataFrame content to curate.
Returns:: The curated DataFrame, possibly with removed rows.
Return type:: pd.DataFrame

pkoffee.data.load_csv(filepath: Path) → DataFrame[source]

Load coffee productivity data from a CSV file.

Parameters:

filepath (str or Path) – Path to the CSV file containing the data. Expected columns: ‘cups’ (int) and ‘productivity’ (float).

Returns:

DataFrame with validated columns ‘cups’ and ‘productivity’.

Return type:

pd.DataFrame

Raises:

CSVReadError – If the CSV reading fails.
ColumnTypeError – If required columns contain invalid data.
FileNotFoundError – If the specified data file does not exist.
MissingColumnsError – If required columns are missing.

Examples

>>> prod_data = load_csv(Path("coffee_productivity.csv"))
>>> print(prod_data.head())
   cups  productivity
0     1           2.1

pkoffee.data.extract_arrays(data: DataFrame) → tuple[ndarray[tuple[int], dtype[float32]], ndarray[tuple[int], dtype[float32]]][source]

Extract cups and productivity as numpy arrays from a DataFrame.

Parameters:: data (pd.DataFrame) – DataFrame containing ‘cups’ and ‘productivity’ columns.
Returns:: Tuple of (cups, productivity) as float arrays.
Return type:: tuple[np.ndarray, np.ndarray]

Examples

>>> data = pd.DataFrame({"cups": [1, 3, 5], "productivity": [0.3, 1.5, 0.8]})
>>> cups, productivity = extract_arrays(data)
>>> print(cups.shape, productivity.shape)
(3,) (3,)

Mathematical models for coffee productivity relationships.

This module provides various parametric models that can be fitted to coffee consumption vs productivity data.

exception pkoffee.fit_model.FunctionNotFoundInMappingError(function: type[ParametricFunction], mapping: Mapping)[source]

Bases: KeyError

Exception when a function is not found in the function to str mapping.

exception pkoffee.fit_model.FunctionIdNotFoundInMappingError(function_id: str, mapping: Mapping)[source]

Bases: KeyError

Exception when a function Identifier is not found in the function Id to function mapping.

exception pkoffee.fit_model.ModelParsingError(model_dict: Mapping)[source]

Bases: ValueError

Exception when a model dictionary representation cannot be parsed into a model.

class pkoffee.fit_model.Model(name: str, function: ParametricFunction, params: dict[str, float32], bounds: ParametersBounds, r_squared: float32 = np.float32(-inf))[source]

Bases: object

Model defined by a prediction function, its parameters, and the parameters’ bounds.

name

Name of the model

Type:: str

function

The model prediction function

Type:: ParametricFunction

params

Model parameters passed to the predict function

Type:: dict[str, data_dtype]

bounds

Boundary values for the model parameters.

Type:: ParametersBounds

name: str

function: ParametricFunction

params: dict[str, float32]

bounds: ParametersBounds

r_squared: float32 = np.float32(-inf)

predict(x: AnyShapeDataDtypeArray) → AnyShapeDataDtypeArray[source]

Evaluate the model on input x.

Parameters:: x (np.ndarray) – The model input as 1D array.
Returns:: Prediction of the model, same shape as x.
Return type:: np.ndarray

classmethod sort(models: list[Self]) → None[source]

Sort models by R² (descending), in-place.

Parameters:: models (list[Self]) – List of models to sort by R²
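
A minimal sketch of an in-place, descending sort by R², assuming a reduced stand-in class. DemoModel and sort_by_r_squared are hypothetical names; the real Model carries more fields.

```python
from dataclasses import dataclass


@dataclass
class DemoModel:
    """Hypothetical, reduced stand-in for Model: only a name and an R² score."""

    name: str
    r_squared: float = float("-inf")  # mirrors the documented default of -inf


def sort_by_r_squared(models: list[DemoModel]) -> None:
    """Sort models by R², best first, modifying the list in place."""
    models.sort(key=lambda m: m.r_squared, reverse=True)


models = [DemoModel("failed"), DemoModel("ok", 0.75), DemoModel("best", 0.99)]
sort_by_r_squared(models)
print([m.name for m in models])  # ['best', 'ok', 'failed']
```

With a default of -inf, a model that was never fitted successfully ends up last in the ranking.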

to_dict(function_to_str: Mapping) → dict[source]

Convert the model to a pure Python dictionary representation.

Numbers are converted to Python floats, and the function is encoded as a string according to function_to_str.

Parameters:: function_to_str (Mapping) – Dict mapping function classes to a string identifier. Ex: {pkoffee.fit_model.Quadratic: “quadratic”}
Returns:: Dictionary representation of a model.
Return type:: dict

Examples

>>> from pkoffee.data import data_dtype
>>> from pkoffee.fit_model_io import pkoffee_function_id_mapping, Quadratic
>>> def_quad = Model(
...     name="DefaultQuadratic",
...     function=Quadratic(),
...     params=Quadratic.param_guess(y_min=data_dtype(0.5)),
...     bounds=Quadratic.param_bounds(),
... )
>>> def_quad.to_dict(pkoffee_function_id_mapping().inv)
{'name': 'DefaultQuadratic', 'function': 'Quadratic', 'params': {'a0': 0.5, 'a1': 0.0, 'a2': 0.009999999776482582}, 'bounds': {'min': {'a0': -inf, 'a1': -inf, 'a2': -inf}, 'max': {'a0': inf, 'a1': inf, 'a2': inf}}, 'r_squared': -inf}

classmethod from_dict(d: Mapping, str_to_function: Mapping) → Self[source]

Create a model from a dictionary representation.

Parameters:

d (Mapping) – Mapping representation of a Model as returned by Model.to_dict
str_to_function (Mapping) – Mapping function identifiers to actual function classes

Returns:

Model instance

Return type:

Self

Examples

>>> from pkoffee.fit_model_io import pkoffee_function_id_mapping
>>> Model.from_dict(
...     {
...         "name": "TestQuadratic",
...         "function": "Quadratic",
...         "params": {"a": 1.0, "b": 0.0, "c": 0.5},
...         "bounds": {"min": {"a": -5.0, "b": -2.0, "c": -1.0}, "max": {"a": 5.0, "b": 2.0, "c": 1.0}},
...         "r_squared": 0.22,
...     },
...     pkoffee_function_id_mapping(),
... )
ModelFit(name='TestQuadratic', R²=0.220)

pkoffee.fit_model.fit_model(x: ndarray, y: ndarray, model: Model, max_iterations: int = 20000) → tuple[Model, ndarray][source]

Fit a single model to the data.

Parameters:

x (np.ndarray) – Input data (independent variable)
y (np.ndarray) – Output data (dependent variable)
model (Model) – Model including function and parameters
max_iterations (int, optional) – Maximum number of optimization iterations, by default 20000

Returns:

Tuple with the fitted model and its predictions on the training data.

Return type:

tuple[Model, np.ndarray]

Raises:

ValueError – If either x or y contain NaNs.
RuntimeError – If the least-squares minimization fails.

Input/Output for models.

A model’s ParametricFunction is not directly saved; only an identifier string is written to file. In order to reconstruct the model from the file, the same mapping from function to identifier needs to be available. This module implements a function returning a bidirectional mapping for the ParametricFunctions implemented in the pkoffee package. This mapping can be extended with additional functions to save other models.
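
The identifier round trip described above can be sketched with two plain dictionaries standing in for the bidirectional mapping. DemoQuadratic and the reduced model dictionary are hypothetical and only illustrate the idea; they are not the package's implementation.

```python
import json


class DemoQuadratic:
    """Hypothetical function class standing in for a ParametricFunction."""


# Two plain dicts play the role of the bidirectional mapping.
str_to_function = {"Quadratic": DemoQuadratic}
function_to_str = {cls: name for name, cls in str_to_function.items()}

model = {"name": "DemoModel", "function": DemoQuadratic(), "params": {"a0": 0.5, "a1": 0.0}}

# Saving: replace the function object by its string identifier.
serializable = {**model, "function": function_to_str[type(model["function"])]}
text = json.dumps(serializable)

# Loading: look the identifier up again and instantiate the function class.
loaded = json.loads(text)
restored = {**loaded, "function": str_to_function[loaded["function"]]()}
```

Loading only works if str_to_function contains the identifier that was written, which is why the same mapping must be available on both sides.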

pkoffee.fit_model_io.pkoffee_function_id_mapping() → bidict[source]: Bidirectional mapping from string Identifiers to the ParametricFunctions implemented in pkoffee.

exception pkoffee.fit_model_io.UnsupportedModelFormatError(file_format: str)[source]

Bases: NotImplementedError

Exception for non-implemented model file format.

class pkoffee.fit_model_io.ModelFileFormat(*values)[source]

Bases: StrEnum

Available format for saving models to file.

TOML = 'toml'

JSON = 'json'

pkoffee.fit_model_io.file_format_from_path(file_path: Path) → ModelFileFormat[source]

Determine a model file’s format from its file path extension.

Parameters:: file_path (Path) – Path to a file, e.g. “model.toml”
Returns:: File format
Return type:: ModelFileFormat
Raises:: UnsupportedModelFormatError – If the file format is not supported
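
One way to derive a format from a path extension is sketched below. All Demo* names are hypothetical stand-ins, the enum uses (str, Enum) rather than StrEnum so it runs on older Python versions, and lowercasing the extension is an assumption about the desired behavior.

```python
from enum import Enum
from pathlib import Path


class DemoFileFormat(str, Enum):
    """Stand-in for ModelFileFormat."""

    TOML = "toml"
    JSON = "json"


class DemoUnsupportedFormatError(NotImplementedError):
    """Stand-in for UnsupportedModelFormatError."""


def demo_format_from_path(file_path: Path) -> DemoFileFormat:
    # Path.suffix includes the leading dot (".toml"); strip it and normalise case.
    extension = file_path.suffix.lstrip(".").lower()
    try:
        # Enum lookup by value raises ValueError for unknown extensions.
        return DemoFileFormat(extension)
    except ValueError as err:
        raise DemoUnsupportedFormatError(extension) from err


print(demo_format_from_path(Path("model.toml")))
```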

pkoffee.fit_model_io.save_models_json(model_dicts: Iterable[dict], output_path: Path) → None[source]

Save the models’ dictionary representations to a JSON file.

Parameters:

model_dicts (Iterable[dict]) – Models dictionary representation
output_path (Path) – Path to save the models

pkoffee.fit_model_io.save_models_toml(model_dicts: Iterable[dict], output_path: Path) → None[source]

Save the models’ dictionary representations to a TOML file.

Parameters:

model_dicts (Iterable[dict]) – Models dictionary representation
output_path (Path) – Path to save the models

pkoffee.fit_model_io.save_models(models: Iterable[Model], function_to_str: Mapping, output_path: Path, file_format: ModelFileFormat | None = None) → None[source]

Save the models to disk.

Parameters:

models (Iterable[Model]) – Collection of models to save
function_to_str (Mapping) – Mapping of function to string identifier used as function representation in the model’s file.
output_path (Path) – Path to the model’s file.
file_format (ModelFileFormat) – The format of the model’s file

pkoffee.fit_model_io.load_models_json(model_file: Path, str_to_function: Mapping) → list[Model][source]

Load models from json file.

Parameters:

model_file (Path) – Path to the models’ file
str_to_function (Mapping) – Mapping of function string identifier to function classes

Returns:

Loaded models

Return type:

list[Model]

pkoffee.fit_model_io.load_models_toml(model_file: Path, str_to_function: Mapping) → list[Model][source]

Load models from toml file.

Parameters:

model_file (Path) – Path to the models’ file
str_to_function (Mapping) – Mapping of function string identifier to function classes

Returns:

Loaded models

Return type:

list[Model]

pkoffee.fit_model_io.load_models(model_file: Path, str_to_function: Mapping, file_format: ModelFileFormat | None = None) → list[Model][source]

Load models from file.

Parameters:

model_file (Path) – Path to the model’s file
str_to_function (Mapping) – Mapping of function string identifier to function classes
file_format (ModelFileFormat) – Format of the model file

Returns:

Loaded models

Return type:

list[Model]

Logging utils.

exception pkoffee.log.LogLevelError(s: str)[source]

Bases: KeyError

Error type for unsupported log levels.

class pkoffee.log.LogLevel(*values)[source]

Bases: Enum

Log Level Enumeration.

NOTSET = 0

DEBUG = 10

INFO = 20

WARNING = 30

ERROR = 40

CRITICAL = 50

classmethod from_string(s: str) → Self[source]: Instantiate a LogLevel from a string.
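
A sketch of how a string could be turned into a level. It assumes lookup by member name, case-insensitively, which the documentation does not state; DemoLogLevel and DemoLogLevelError are hypothetical stand-ins.

```python
from enum import Enum


class DemoLogLevelError(KeyError):
    """Stand-in for LogLevelError."""


class DemoLogLevel(Enum):
    """Stand-in for LogLevel, reduced to three members."""

    DEBUG = 10
    INFO = 20
    WARNING = 30

    @classmethod
    def from_string(cls, s: str) -> "DemoLogLevel":
        # Assumption: members are looked up by name, ignoring case.
        try:
            return cls[s.upper()]
        except KeyError as err:
            raise DemoLogLevelError(s) from err


level = DemoLogLevel.from_string("debug")
print(level, level.value)
```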

pkoffee.log.log_uncaught_exceptions() → None[source]

Make all uncaught exceptions be logged by the default logger.

KeyboardInterrupt and its subclasses are not logged, so the program can still be killed with Ctrl+C.
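
This behavior is commonly obtained by replacing sys.excepthook. The sketch below uses the root logger and a hypothetical function name; it is one possible approach, not necessarily the package's implementation.

```python
import logging
import sys


def demo_log_uncaught_exceptions() -> None:
    """Route uncaught exceptions to the root logger, except KeyboardInterrupt."""

    def handle(exc_type, exc_value, exc_traceback):
        if issubclass(exc_type, KeyboardInterrupt):
            # Defer to the interpreter's default hook so Ctrl+C is not logged.
            sys.__excepthook__(exc_type, exc_value, exc_traceback)
            return
        logging.getLogger().critical(
            "Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback)
        )

    # From now on, the interpreter calls handle() for every uncaught exception.
    sys.excepthook = handle
```

Calling demo_log_uncaught_exceptions() once at program start is enough; the hook stays installed until sys.excepthook is reassigned.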

pkoffee.log.init_logging(log_file: Path | None, log_level: LogLevel) → None[source]

(Re-)initialize all loggers.

Parameters:

log_file (Path | None) – File to write the log to. If None, the log goes to standard output/error.
log_level (LogLevel) – Logging level.

Model evaluation metrics for assessing fit quality.

exception pkoffee.metrics.SizeMismatchError(size_a: int, size_b: int)[source]

Bases: ValueError

Exception for arrays whose sizes do not match.

pkoffee.metrics.check_size_match(array_a: ndarray, array_b: ndarray) → None[source]

Check that two arrays have the same size; raise SizeMismatchError if not.

Parameters:

array_a (np.ndarray) – First array
array_b (np.ndarray) – Second array

Raises:

SizeMismatchError – If the two arrays’ sizes are not equal.

pkoffee.metrics.compute_r2(y_true: ndarray, y_pred: ndarray) → float32[source]

Calculate the coefficient of determination (R² score).

R² indicates the proportion of variance in the dependent variable that is predictable from the independent variable(s).

Parameters:

y_true (np.ndarray) – True observed values.
y_pred (np.ndarray) – Predicted values from the model.

Returns:

R² score. Values closer to 1.0 indicate better fit. Can be negative for very poor fits.

Return type:

float

Notes

R² = 1 - (SS_res / SS_tot) where:

SS_res = Σ(y_true - y_pred)² (residual sum of squares)
SS_tot = Σ(y_true - ȳ)² (total sum of squares)

Examples

>>> y_true = np.array([1.0, 2.0, 3.0, 4.0])
>>> y_pred = np.array([1.1, 1.9, 3.1, 3.9])
>>> r2 = compute_r2(y_true, y_pred)
>>> print(f"R² = {r2:.4f}")
R² = 0.9920
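
The formula in the Notes can be checked by hand on the same data. This sketch uses plain Python lists instead of NumPy arrays, and the y_bad predictions are an added illustration of a negative score.

```python
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.1, 3.9]

mean_true = sum(y_true) / len(y_true)  # ȳ = 2.5
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # about 0.04
ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # 5.0
r2 = 1.0 - ss_res / ss_tot
print(f"R² = {r2:.4f}")  # R² = 0.9920

# Predictions that are worse than always predicting the mean give a negative R².
y_bad = [4.0, 3.0, 2.0, 1.0]
ss_res_bad = sum((t - p) ** 2 for t, p in zip(y_true, y_bad))  # 20.0
r2_bad = 1.0 - ss_res_bad / ss_tot
print(f"R² = {r2_bad:.4f}")  # R² = -3.0000
```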

pkoffee.metrics.compute_rmse(y_true: ndarray, y_pred: ndarray) → float32[source]

Calculate Root Mean Squared Error (RMSE).

Parameters:

y_true (np.ndarray) – True observed values.
y_pred (np.ndarray) – Predicted values from the model.

Returns:

RMSE value. Lower is better, 0 is perfect fit.

Return type:

float

Examples

>>> y_true = np.array([1.0, 2.0, 3.0, 4.0])
>>> y_pred = np.array([1.1, 1.9, 3.1, 3.9])
>>> rmse = compute_rmse(y_true, y_pred)
>>> print(f"RMSE = {rmse:.4f}")
RMSE = 0.1000

pkoffee.metrics.compute_mae(y_true: ndarray, y_pred: ndarray) → float32[source]

Calculate Mean Absolute Error (MAE).

Parameters:

y_true (np.ndarray) – True observed values.
y_pred (np.ndarray) – Predicted values from the model.

Returns:

MAE value. Lower is better, 0 is perfect fit.

Return type:

float

Examples

>>> y_true = np.array([1.0, 2.0, 3.0, 4.0])
>>> y_pred = np.array([1.1, 1.9, 3.1, 3.9])
>>> mae = compute_mae(y_true, y_pred)
>>> print(f"MAE = {mae:.4f}")
MAE = 0.1000
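
RMSE and MAE coincide in the examples above because every error has the same magnitude. The sketch below uses the standard definitions in plain Python, with hypothetical helper names rmse and mae, to show how a single large error separates the two metrics.

```python
import math


def rmse(y_true, y_pred):
    """Root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
y_outlier = [1.0, 2.0, 3.0, 8.0]  # three exact predictions, one error of 4

print(rmse(y_true, y_outlier))  # 2.0
print(mae(y_true, y_outlier))  # 1.0
```

Squaring the errors makes RMSE more sensitive to outliers than MAE.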

Parametric functions.

This module provides functions with signature f(x, *args, **kwargs), where x is the function’s input and the other arguments are the function parameters. Functions also provide guesses and boundaries for parameter values.

class pkoffee.parametric_function.ParametersBounds(min: dict[str, float32], max: dict[str, float32])[source]

Bases: NamedTuple

Store the minimum and maximum bounds.

min

Minimum bounds

Type:: dict[str, data_dtype]

max

Maximum bounds

Type:: dict[str, data_dtype]

min: dict[str, float32]: Alias for field number 0

max: dict[str, float32]: Alias for field number 1

class pkoffee.parametric_function.ParametricFunction(*args, **kwargs)[source]

Bases: Protocol

Parametric function API.

abstractmethod classmethod param_guess(*args: Any, **kwargs: Any) → dict[str, float32][source]

Guess values of the ParametricFunction parameters.

The guess values can typically be used as starting values for a fit of the parameters.

The guesses may require some information about the data (e.g. range, min/max values); therefore this method is allowed to take any input.

Returns:: Dictionary mapping parameter names to guessed values.
Return type:: dict[str, data_dtype]

abstractmethod classmethod param_bounds() → ParametersBounds[source]

Min/max values of the ParametricFunction parameters.

The keys of the ParametersBounds dictionaries are the parameters’ names.

Returns:: min/max values of the parameters.
Return type:: ParametersBounds
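
A sketch of a class satisfying such a protocol, using plain Python floats. The Demo* names and the `__call__` signature are assumptions made for illustration; the guess and bound values follow the quadratic example shown elsewhere in this reference.

```python
from typing import NamedTuple, Protocol


class DemoBounds(NamedTuple):
    """Stand-in for ParametersBounds."""

    min: dict[str, float]
    max: dict[str, float]


class DemoParametricFunction(Protocol):
    """Assumed shape of the API: callable, with class-level guesses and bounds."""

    def __call__(self, x: float, **params: float) -> float: ...

    @classmethod
    def param_guess(cls, *args, **kwargs) -> dict[str, float]: ...

    @classmethod
    def param_bounds(cls) -> DemoBounds: ...


class DemoQuadratic:
    """f(x) = a0 + a1*x + a2*x²; conforms to the protocol without inheriting from it."""

    def __call__(self, x: float, a0: float, a1: float, a2: float) -> float:
        return a0 + a1 * x + a2 * x**2

    @classmethod
    def param_guess(cls, y_min: float) -> dict[str, float]:
        return {"a0": y_min, "a1": 0.0, "a2": 0.01}

    @classmethod
    def param_bounds(cls) -> DemoBounds:
        names = ("a0", "a1", "a2")
        inf = float("inf")
        return DemoBounds(min={n: -inf for n in names}, max={n: inf for n in names})


f: DemoParametricFunction = DemoQuadratic()
guess = DemoQuadratic.param_guess(y_min=0.5)
value = f(2.0, **guess)  # 0.5 + 0.0 * 2 + 0.01 * 4
```

Because the parameter names are the dictionary keys, guesses and bounds can be passed to the function by keyword.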

class pkoffee.parametric_function.Quadratic[source]

Bases: object

Quadratic (polynomial) function: f(x) = a₀ + a₁x + a₂x².

References

1. Wikipedia contributors. (2025, September 16). Quadratic function. In Wikipedia, The Free Encyclopedia. Retrieved 19:28, December 1, 2025, from https://en.wikipedia.org/w/index.php?title=Quadratic_function&oldid=1311755644

classmethod param_guess(y_min: float32) → dict[str, float32][source]

Parameter guesses to use as starting values for a fit.

The linear coefficient guess is 0.0, and the quadratic coefficient 0.01. The constant term guess is the minimum value of the predictions in the data points: if modeling y = a₀ + a₁x + a₂x², then min(y).

Parameters:: y_min (data_dtype) – The minimal value of the predictions.
Returns:: Dictionary mapping parameter names to guesses.
Return type:: dict[str, data_dtype]

classmethod param_bounds() → ParametersBounds[source]: Boundary values for the Quadratic function.

class pkoffee.parametric_function.MichaelisMentenSaturation[source]

Bases: object

Michaelis-Menten (saturating) model: f(x) = y₀ + Vₘₐₓ·x/(K + x).

This model describes saturation behavior common in enzyme kinetics and can represent diminishing returns.
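
The saturation behavior can be seen by evaluating the formula directly. The helper below is a plain-Python sketch with arbitrarily chosen parameter values, not the package's class.

```python
def michaelis_menten(x: float, v_max: float, k: float, y0: float) -> float:
    """f(x) = y0 + v_max * x / (k + x)."""
    return y0 + v_max * x / (k + x)


v_max, k, y0 = 4.0, 2.0, 1.0  # illustrative values only

at_zero = michaelis_menten(0.0, v_max, k, y0)  # y0
at_k = michaelis_menten(k, v_max, k, y0)  # y0 + v_max / 2
far_out = michaelis_menten(1e9, v_max, k, y0)  # approaches y0 + v_max

print(at_zero, at_k, far_out)
```

At x = K the function has covered half of its total rise, and for large x it levels off at y₀ + Vₘₐₓ.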

References

Wikipedia contributors. (2025, December 1). Michaelis-Menten kinetics. In Wikipedia, The Free Encyclopedia. Retrieved 19:32, December 1, 2025, from https://en.wikipedia.org/w/index.php?title=Michaelis%E2%80%93Menten_kinetics&oldid=1325118298

classmethod param_guess(x_min: float32, x_max: float32, y_min: float32, y_max: float32) → dict[str, float32][source]

Parameter guesses to use as initial values for a fit.

x are the function input values, y the predictions in the data points. v_max’s guess is the prediction range; k, the input value at mid-growth, is guessed as the input value at 20% of the input range; y0’s guess is the minimum input value.

Parameters:

x_min (data_dtype) – Minimum input value
x_max (data_dtype) – Maximum input value
y_min (data_dtype) – Minimum prediction value
y_max (data_dtype) – Maximum prediction value

Returns:

Dictionary mapping parameter names to guesses.

Return type:

dict[str, data_dtype]

classmethod param_bounds() → ParametersBounds[source]: Boundary values for the MichaelisMentenSaturation.

class pkoffee.parametric_function.Logistic[source]

Bases: object

Logistic (sigmoid) model: f(x) = y₀ + L/(1 + e^(-k(x - x₀))).

Models S-shaped growth with lower and upper asymptotes.
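
A plain-Python evaluation of the formula shows the two asymptotes and the midpoint. The helper and its parameter values are illustrative assumptions.

```python
import math


def logistic(x: float, L: float, k: float, x0: float, y0: float) -> float:
    """f(x) = y0 + L / (1 + exp(-k * (x - x0)))."""
    return y0 + L / (1.0 + math.exp(-k * (x - x0)))


L, k, x0, y0 = 2.0, 0.5, 3.0, 1.0  # illustrative values only

lower = logistic(-100.0, L, k, x0, y0)  # close to the lower asymptote y0
middle = logistic(x0, L, k, x0, y0)  # halfway between the asymptotes: y0 + L / 2
upper = logistic(100.0, L, k, x0, y0)  # close to the upper asymptote y0 + L

print(lower, middle, upper)
```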

References

Wikipedia contributors. (2025, November 29). Logistic regression. In Wikipedia, The Free Encyclopedia. Retrieved 19:34, December 1, 2025, from https://en.wikipedia.org/w/index.php?title=Logistic_regression&oldid=1324697470

classmethod param_guess(x_min: float32, x_max: float32, y_min: float32, y_max: float32) → dict[str, float32][source]

Parameter guesses to use as initial values for a fit.

x are the function input values, y the predictions in the data points. L is typically close to the range of the prediction values; k controls the width of the transition interval between the two asymptotes (guess is 0.5); x0, the midpoint, is placed in the middle of the input value distribution; y0, the lower asymptote, should be close to the minimum of the predictions.

Parameters:

x_min (data_dtype) – Minimum input value
x_max (data_dtype) – Maximum input value
y_min (data_dtype) – Minimum prediction value
y_max (data_dtype) – Maximum prediction value

Returns:

Dictionary mapping parameter names to guesses.

Return type:

dict[str, data_dtype]

classmethod param_bounds() → ParametersBounds[source]: Boundary values for the Logistic.

class pkoffee.parametric_function.PeakModel[source]

Bases: object

Peak model (gamma-like): f(x) = a·x·e^(-x/b).

Models a single peak with exponential decay, useful for representing optimal consumption with negative effects beyond peak.

References

Wikipedia contributors. (2025, November 4). Gamma distribution. In Wikipedia, The Free Encyclopedia. Retrieved 19:38, December 1, 2025, from https://en.wikipedia.org/w/index.php?title=Gamma_distribution&oldid=1320436343

classmethod param_guess(x_min: float32, x_max: float32, y_max: float32) → dict[str, float32][source]

Parameter guesses to use as initial values for a fit.

x are the function input values, y the predictions in the data points. a’s guess is the maximum prediction value; b’s guess is the midpoint of the input value range.

Parameters:

x_min (data_dtype) – Minimum input value
x_max (data_dtype) – Maximum input value
y_max (data_dtype) – Maximum prediction value

Returns:

Dictionary mapping parameter names to guesses.

Return type:

dict[str, data_dtype]

classmethod param_bounds() → ParametersBounds[source]: Boundary values for the PeakModel.

class pkoffee.parametric_function.Peak2Model[source]

Bases: object

Quadratic peak model: f(x) = a·x²·e^(-x/b).

Similar to PeakModel but with quadratic growth before decay.
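
Setting the derivative to zero gives the peak position for both shapes: a·x·e^(-x/b) peaks at x = b, while a·x²·e^(-x/b) peaks at x = 2b. The sketch below checks this numerically on a grid, with plain-Python helpers and illustrative parameter values.

```python
import math


def peak(x: float, a: float, b: float) -> float:
    """f(x) = a * x * exp(-x / b)."""
    return a * x * math.exp(-x / b)


def peak2(x: float, a: float, b: float) -> float:
    """f(x) = a * x² * exp(-x / b)."""
    return a * x**2 * math.exp(-x / b)


a, b = 2.0, 3.0  # illustrative values only
xs = [i / 100 for i in range(1, 2001)]  # grid from 0.01 to 20.0

# Locate the grid point where each function is largest.
peak_x = max(xs, key=lambda x: peak(x, a, b))
peak2_x = max(xs, key=lambda x: peak2(x, a, b))

print(peak_x, peak2_x)  # 3.0 6.0
```

For the same b, the quadratic variant therefore places its optimum twice as far along the input axis.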

classmethod param_guess(x_min: float32, x_max: float32, y_max: float32) → dict[str, float32][source]

Parameter guesses to use as initial values for a fit.

x are the function input values, y the predictions in the data points. a’s guess is the maximum prediction value divided by the maximum input value squared; b’s guess is the midpoint of the input value range.

Parameters:

x_min (data_dtype) – Minimum input value
x_max (data_dtype) – Maximum input value
y_max (data_dtype) – Maximum prediction value

Returns:

Dictionary mapping parameter names to guesses.

Return type:

dict[str, data_dtype]

classmethod param_bounds() → ParametersBounds[source]: Boundary values for the Peak2Model.

Coffee Productivity analysis module.

pkoffee.productivity_analysis.default_models(x: ndarray, y: ndarray) → list[Model][source]

Generate model configurations with suited initial parameter guesses.

Parameters:

x (np.ndarray) – Input data (cups).
y (np.ndarray) – Output data (productivity).

Returns:

List of model configurations ready for fitting.

Return type:

list[Model]

pkoffee.productivity_analysis.fit_all_models(data: DataFrame, max_iterations: int = 20000) → list[Model][source]

Fit all available models to the data and rank by R².

Parameters:

data (pd.DataFrame) – DataFrame with ‘cups’ and ‘productivity’ columns.
max_iterations (int, optional) – Maximum iterations for optimization, by default 20000.

Returns:

List of fitted models, sorted by R² (descending). Models for which fitting failed are still in the list with default values (R²=-inf).

Return type:

list[Model]

Examples

>>> data = load_csv(Path(tmpfile.name))
>>> models = fit_all_models(data)
>>> for model in models:
...     print(f"{model.name}: R² = {model.r_squared:.4f}")
Quadratic: R² = 0.9978
Peak²: R² = 0.9115
Logistic: R² = 0.7525
Peak: R² = 0.6699
Michaelis-Menten: R² = 0.2347

pkoffee.productivity_analysis.format_model_rankings(fitted_models: list[Model]) → str[source]

Format a table of model rankings as a string.

Parameters:: fitted_models (list[Model]) – List of fitted models, should be sorted by R².

Examples

>>> from pkoffee.data import load_csv
>>> from pkoffee.productivity_analysis import fit_all_models
>>> data = load_csv(Path("coffee_productivity.csv"))
>>> models = fit_all_models(data)
>>> print(format_model_rankings(models))
Model Rankings:
══════════════════════════════════════════════════
Rank   Model                R² Score
══════════════════════════════════════════════════
1      Quadratic            0.9978
2      Peak²                0.9115
3      Logistic             0.7525
4      Peak                 0.6699
5      Michaelis-Menten     0.2347
══════════════════════════════════════════════════

pkoffee.productivity_analysis.analyze(args: Namespace) → None[source]

Fit models on input data and save them to file.

Parameters:: args (argparse.Namespace) – Parsed command-line arguments.

Visualization utilities for coffee productivity analysis.

class pkoffee.visualization.Show(*values)[source]

Bases: Enum

To show or not to show a figure.

YES = 1

NO = 2

exception pkoffee.visualization.NoModelProvidedError[source]

Bases: ValueError

Exception raised when no model is provided.

class pkoffee.visualization.FigureParameters(y_limits: tuple[float, float] | None = None, figsize: tuple[float, float] | None = (12, 7), dpi: int = 150)[source]

Bases: NamedTuple

Usual parameters of matplotlib.figure.Figure.

y_limits

Limits of the y axis (min, max), default is None (to let matplotlib determine the values).

Type:: tuple[float, float] | None

figsize

Figure size in inches (matplotlib unit…) as (width, height). Default is (12, 7)

Type:: tuple[float, float] | None

dpi

Dots per inch (resolution) to use for the figure. Default is 150.

Type:: int

y_limits: tuple[float, float] | None: Alias for field number 0

figsize: tuple[float, float] | None: Alias for field number 1

dpi: int: Alias for field number 2

pkoffee.visualization.draw_data_violin(ax: Axes, data: DataFrame) → None[source]

Draw a violin plot of the data on ax.

Parameters:

ax (Axes) – Axes onto which to draw
data (pd.DataFrame) – The DataFrame with the data to draw

pkoffee.visualization.draw_model_lines(ax: Axes, x_smooth: ndarray, y_smooth: list[ndarray | None], labels: list[str], fig_params: FigureParameters) → None[source]

Draw the models’ prediction lines onto ax.

Parameters:

ax (Axes) – Axes onto which to draw
x_smooth (np.ndarray) – x values of the line points
y_smooth (list[np.ndarray | None]) – List of y values of the line points, one per element in labels
labels (list[str]) – List of labels to use in the plot legend
fig_params (FigureParameters) – Figure parameters

pkoffee.visualization.plot_models(data: DataFrame, fitted_models: list[Model], output_path: Path | None = None, fig_params: FigureParameters | None = None, show: Show = Show.YES) → None[source]

Create a comprehensive analysis plot with data distribution and model fits.

Parameters:

data (pd.DataFrame) – DataFrame containing ‘cups’ and ‘productivity’ columns.
fitted_models (list[Model]) – List of fitted models to overlay on the plot.
fig_params (FigureParameters | None) – Figure parameters
output_path (Path | None) – Path to save the figure. If None, figure is not saved.
show (Show) – Whether to display the plot, by default YES.

Returns:

The created matplotlib figure.

Return type:

plt.Figure

Examples

>>> from pkoffee.data import load_csv
>>> from pkoffee.productivity_analysis import fit_all_models
>>> data = load_csv(Path("coffee_productivity.csv"))
>>> models = fit_all_models(data)
>>> plot_models(data, models, Path("analysis.png"))

pkoffee.visualization.create_comparison_plot(data: DataFrame, fitted_models: list[Model], output_path: Path | None = None, fig_params: FigureParameters | None = None, show: Show = Show.NO) → None[source]

Create a multi-panel comparison plot showing each model separately.

Parameters:

data (pd.DataFrame) – DataFrame containing ‘cups’ and ‘productivity’ columns.
fitted_models (list[Model]) – List of fitted models to display.
output_path (str or Path, optional) – Path to save the figure.
fig_params (FigureParameters) – Configuration values for the matplotlib figure
show (Show) – Whether to show the figure or not.

pkoffee.visualization.visualize(args: Namespace) → None[source]

Plot model predictions and data.

Parameters:: args (argparse.Namespace) – Parsed command-line arguments.