# JumpStart Model Evaluation Framework

You've successfully downloaded and extracted the evaluation package. Follow the instructions below to get started.

## Getting Started

**Upload to SageMaker Studio:**
- Upload this entire folder to your JupyterLab space
- Open `jumpstart_model_comparison.ipynb`
- Run the first cell to install all required packages
- Follow the step-by-step instructions in the notebook

## What's Included

```
SageMaker_Repo/
├── README.md                                    # This file
├── jumpstart_model_comparison.ipynb            # Main evaluation notebook
├── sagemaker_production_monitoring.ipynb       # Production monitoring guide
├── src/                                         # Python modules
│   ├── __init__.py                             # Package initialization
│   ├── step_3_dataset.py                      # Dataset loading and CSV handling
│   ├── step_4_functions.py                    # Core evaluation functions
│   ├── step_5_visualization.py                # Results display and tables
│   ├── step_6_comparison.py                   # Model comparison orchestration
│   ├── step_7_visual_performance.py           # Advanced visualizations
│   ├── unsupervised_evaluation.py             # Unsupervised evaluation logic
│   └── unsupervised_visualization.py          # Unsupervised visualizations
└── data/                                        # Sample datasets
    ├── sample_supervised.csv                   # Labeled examples (100 samples)
    └── sample_unsupervised.csv                 # Unlabeled examples (100 samples)
```

**Main Notebook:**
- `jumpstart_model_comparison.ipynb`: Complete evaluation workflow with step-by-step guidance

**Core Modules (src/ directory):**
- `src/step_3_dataset.py`: Handles both custom CSV uploads and built-in test datasets with automatic format detection
- `src/step_4_functions.py`: Contains the evaluation functions, metric calculations, and model invocation logic
- `src/step_5_visualization.py`: Generates performance tables and handles the display logic for single vs. multiple models
- `src/step_6_comparison.py`: Orchestrates model comparisons and aggregates results across datasets
- `src/step_7_visual_performance.py`: Creates advanced visualizations (confusion matrices, ROC curves, latency plots)
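
To make the evaluation step more concrete, here is a minimal sketch of how accuracy and per-call latency could be computed for one model. It is not the code in `src/step_4_functions.py`: the `evaluate` helper, the `keyword_model` stand-in, and the example rows are all hypothetical, and a real run would call a deployed endpoint instead of a local function.

```python
import time
from statistics import mean


def evaluate(predict, examples):
    """Score a predict(text) -> label callable on (text, label) pairs.

    Hypothetical helper: `predict` stands in for a real endpoint invocation.
    """
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    latencies = []
    for text, expected in examples:
        start = time.perf_counter()
        predicted = predict(text)
        latencies.append(time.perf_counter() - start)
        correct += int(predicted == expected)
    return {
        "accuracy": correct / len(examples),
        "mean_latency_s": mean(latencies),
    }


def keyword_model(text):
    # Toy stand-in for a deployed model; not a real classifier.
    return "positive" if "good" in text.lower() else "negative"


examples = [
    ("Good value for the price", "positive"),
    ("Stopped working after a day", "negative"),
    ("Not good at all", "negative"),  # the toy model gets this one wrong
]
results = evaluate(keyword_model, examples)
print(f"accuracy: {results['accuracy']:.2f}")
print(f"mean latency: {results['mean_latency_s']:.6f} s")
```

Passing the model in as a plain callable keeps the scoring loop independent of how the model is invoked, so the same loop could be reused for each model being compared.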

**Specialized Modules (src/ directory):**
- `src/unsupervised_evaluation.py`: Handles evaluation when only text data is available (no labels)
- `src/unsupervised_visualization.py`: Creates confidence distribution plots and prediction analysis
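
When no labels are available, one thing that can still be inspected is how confident a model is in its predictions. The sketch below bins confidence scores into a simple text histogram. It only illustrates the idea: `confidence_histogram` and the sample scores are assumptions, not functions or data from `src/unsupervised_visualization.py`.

```python
from collections import Counter


def confidence_histogram(confidences, bins=5):
    """Count scores in [0, 1] per equal-width bin; returns a list of length `bins`."""
    counts = Counter()
    for score in confidences:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"confidence out of range: {score}")
        # min() keeps a score of exactly 1.0 in the last bin.
        counts[min(int(score * bins), bins - 1)] += 1
    return [counts[i] for i in range(bins)]


# Made-up confidence scores for illustration only.
scores = [0.95, 0.91, 0.62, 0.55, 0.48, 0.99, 1.0]
histogram = confidence_histogram(scores)
for i, count in enumerate(histogram):
    low, high = i / 5, (i + 1) / 5
    print(f"{low:.1f}-{high:.1f}: {'#' * count}")
```

A distribution concentrated in the top bin suggests the model is rarely uncertain, while a flat or low-skewed distribution may be a sign that the inputs differ from what the model was trained on.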

**Sample Data (data/ directory):**
- `data/sample_supervised.csv`: Labeled examples across multiple domains for testing supervised evaluation
- `data/sample_unsupervised.csv`: Unlabeled examples for testing unsupervised evaluation
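
The two sample files differ in whether they carry labels, which is the distinction that the automatic format detection in `src/step_3_dataset.py` has to make. As a rough sketch of one way to do that, the example below checks a CSV header for a label column. The function name `detect_dataset_format` and the column names `text` and `label` are assumptions for illustration; check the sample files for the schema the framework actually uses.

```python
import csv
import io


def detect_dataset_format(csv_text, text_column="text", label_column="label"):
    """Return "supervised" if the CSV has a label column, else "unsupervised".

    Hypothetical helper: the column names are assumptions, not the framework's schema.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = [name.strip().lower() for name in (reader.fieldnames or [])]
    if text_column not in columns:
        raise ValueError(f"expected a '{text_column}' column, found {columns}")
    return "supervised" if label_column in columns else "unsupervised"


# Inline stand-ins for the sample files, so the sketch runs without the data/ folder.
labeled = "text,label\ngreat product,positive\nterrible service,negative\n"
unlabeled = "text\ngreat product\nterrible service\n"

print(detect_dataset_format(labeled))
print(detect_dataset_format(unlabeled))
```

To try it on a real file, read the file contents first, for example with `pathlib.Path("data/sample_supervised.csv").read_text()`, and pass the resulting string to the function.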


## Important Notes

- **Upload the entire folder** - All files work together as a framework
- **Start with the main notebook** - `jumpstart_model_comparison.ipynb` guides you through everything
- **Sample data included** - Use the provided datasets or substitute your own
- **Need help?** - All troubleshooting guidance is included in the notebooks
