{ "cells": [ { "cell_type": "markdown", "id": "c2b07c61", "metadata": {}, "source": [ "# PKoffee Analysis Example\n", "\n", "Welcome to the **PKoffee** project analysis notebook! ☕️\n", "\n", "This project analyzes the relationship between coffee consumption (in cups) and productivity. We fit several parametric models to the data and compare them to find the best fit.\n", "\n", "## Project Structure\n", "\n", "- `pkoffee/data.py`: Data loading and cleaning.\n", "- `pkoffee/parametric_function.py`: Definitions of the models (Quadratic, Logistic, etc.).\n", "- `pkoffee/productivity_analysis.py`: Logic to fit the models and rank them.\n", "- `pkoffee/visualization.py`: Utilities to plot the results." ] }, { "cell_type": "code", "execution_count": null, "id": "8b520d72", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "from pkoffee.data import load_csv\n", "from pkoffee.productivity_analysis import fit_all_models, format_model_rankings\n", "from pkoffee.visualization import plot_models, Show" ] }, { "cell_type": "markdown", "id": "51afb4da", "metadata": {}, "source": [ "## 1. Load the Data\n", "\n", "The experimental data are stored in the file `../analysis/coffee_productivity.csv`." ] }, { "cell_type": "code", "execution_count": null, "id": "f67ccbe6", "metadata": {}, "outputs": [], "source": [ "# Load the data (path is relative to this notebook's directory)\n", "data_path = Path(\"../analysis/coffee_productivity.csv\")\n", "data = load_csv(data_path)\n", "\n", "print(f\"Loaded {len(data)} data points from {data_path}\")\n", "data.head()" ] }, { "cell_type": "markdown", "id": "3869fd99", "metadata": {}, "source": [ "## 2. Model Fitting\n", "\n", "We now fit several parametric models to the data:\n", "\n", "- **Quadratic**: $f(x) = a_0 + a_1 x + a_2 x^2$\n", "- **Michaelis-Menten**: $f(x) = y_0 + V_{max} \\frac{x}{K + x}$\n", "- **Logistic**: $f(x) = y_0 + \\frac{L}{1 + e^{-k(x - x_0)}}$\n", "- **Peak Model**: $f(x) = a \\cdot x \\cdot e^{-x/b}$\n", "\n", "The `fit_all_models` function runs the optimization for each of these models and ranks them by their $R^2$ score." ] }, { "cell_type": "code", "execution_count": null, "id": "0475ff25", "metadata": {}, "outputs": [], "source": [ "# Fit all models\n", "fitted_models = fit_all_models(data)\n", "\n", "# Print rankings\n", "print(\"Model Rankings (by R²):\")\n", "print(format_model_rankings(fitted_models))" ] }, { "cell_type": "markdown", "id": "1a091164", "metadata": {}, "source": [ "## 3. Visualization\n", "\n", "Finally, we visualize the data distribution with a violin plot and overlay the fitted model curves to see which model most accurately captures the \"coffee sweet spot\"." ] }, { "cell_type": "code", "execution_count": null, "id": "32cb8acd", "metadata": {}, "outputs": [], "source": [ "# Plot the data distribution with the fitted model curves overlaid\n", "plot_models(data, fitted_models, show=Show.YES)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.10" } }, "nbformat": 4, "nbformat_minor": 5 }