{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c2b07c61",
   "metadata": {},
   "source": [
    "# PKoffee Analysis example\n",
    "\n",
    "Welcome to the **PKoffee** project analysis notebook! ☕️\n",
    "\n",
    "This project aims to analyze the relationship between coffee consumption (in cups) and productivity. We use several mathematical models to find the best fit for our data.\n",
    "\n",
    "## Project Structure\n",
    "\n",
    "- `pkoffee/data.py`: Data loading and cleaning.\n",
    "- `pkoffee/parametric_function.py`: Definition of various models (Quadratic, Logistic, etc.).\n",
    "- `pkoffee/productivity_analysis.py`: Logic to fit models and rank them.\n",
    "- `pkoffee/visualization.py`: Utilities to plot results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b520d72",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "from pathlib import Path\n",
    "import os\n",
    "from pkoffee.data import load_csv\n",
    "from pkoffee.productivity_analysis import fit_all_models, format_model_rankings\n",
    "from pkoffee.visualization import plot_models, Show"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51afb4da",
   "metadata": {},
   "source": [
    "## 1. Load the data\n",
    "\n",
    "The experimental data are in file `../analysis/coffee_productivity.csv`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f67ccbe6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the data\n",
    "data_path = Path(\"../analysis/coffee_productivity.csv\")\n",
    "data = load_csv(data_path)\n",
    "\n",
    "print(f\"Loaded {len(data)} data points from {data_path}\")\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3869fd99",
   "metadata": {},
   "source": [
    "## 2. Model Fitting\n",
    "\n",
    "We will now fit several parametric models to the data:\n",
    "- **Quadratic**: $f(x) = a_0 + a_1 x + a_2 x^2$\n",
    "- **Michaelis-Menten**: $f(x) = y_0 + V_{max} \\frac{x}{K + x}$\n",
    "- **Logistic**: $f(x) = y_0 + \\frac{L}{1 + e^{-k(x - x_0)}}$\n",
    "- **Peak Model**: $f(x) = a \\cdot x \\cdot e^{-x/b}$\n",
    "\n",
    "The `fit_all_models` function will run the optimization for all these models and rank them using the $R^2$ score."
   ]
  },
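  {
   "cell_type": "markdown",
   "id": "r2-score-note",
   "metadata": {},
   "source": [
    "As a reminder of what the ranking measures, the coefficient of determination is conventionally defined as shown here. This is the textbook form; whether `fit_all_models` computes it exactly this way (for example, without an adjustment for the number of parameters) is an assumption, and the symbols $y_i$, $\\bar{y}$ and $N$ are introduced only for illustration.\n",
    "\n",
    "$$R^2 = 1 - \\frac{\\sum_{i=1}^{N} \\left(y_i - f(x_i)\\right)^2}{\\sum_{i=1}^{N} \\left(y_i - \\bar{y}\\right)^2}$$\n",
    "\n",
    "Here $x_i$ and $y_i$ are the observed cups and productivity values, $f$ is the fitted model, and $\\bar{y}$ is the mean observed productivity. A value close to 1 means the model explains most of the variance in the data; a value near 0 means it does no better than predicting the mean."
   ]
  },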
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0475ff25",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Fit all models\n",
    "fitted_models = fit_all_models(data)\n",
    "\n",
    "# Print rankings\n",
    "print(\"Model Rankings (by R²):\")\n",
    "print(format_model_rankings(fitted_models))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a091164",
   "metadata": {},
   "source": [
    "## 3. Visualization\n",
    "\n",
    "Finally, we visualize the data distribution using a violin plot and overlay the fitted model curves to see which one accurately captures the \"coffee sweet spot\"."
   ]
  },
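  {
   "cell_type": "markdown",
   "id": "sweet-spot-note",
   "metadata": {},
   "source": [
    "Where each model places the sweet spot can be worked out from the formulas in section 2. The short derivation below is a sketch that assumes positive parameters ($a, b, V_{max}, K, L, k > 0$); the fitted values are not guaranteed to satisfy this, and the symbol $x_{\\mathrm{peak}}$ is introduced here only for illustration.\n",
    "\n",
    "For the Peak Model, setting the derivative to zero gives\n",
    "\n",
    "$$f'(x) = a e^{-x/b} \\left(1 - \\frac{x}{b}\\right) = 0 \\quad\\Rightarrow\\quad x_{\\mathrm{peak}} = b, \\qquad f(x_{\\mathrm{peak}}) = \\frac{a b}{e}$$\n",
    "\n",
    "so $b$ is directly the number of cups at which productivity peaks. The Quadratic model has an interior maximum at $x_{\\mathrm{peak}} = -a_1 / (2 a_2)$ only when $a_2 < 0$. Under the same sign assumptions, the Michaelis-Menten and Logistic models increase monotonically and saturate at $y_0 + V_{max}$ and $y_0 + L$ respectively, so they cannot represent a decline after the peak."
   ]
  },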
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "32cb8acd",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_models(data, fitted_models, show=Show.YES)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}