Training SOEN Models
Understanding how to train superconducting optoelectronic neural networks
Training Overview
Training SOEN models involves optimizing the parameters of superconducting circuits to perform specific computational tasks. Unlike traditional neural networks, SOEN models operate with temporal dynamics and physical constraints that require specialized training approaches.
The training process encompasses several key components: data preparation, loss function selection, optimization strategies, and evaluation metrics. Each component must be carefully configured to account for the unique properties of superconducting optoelectronic hardware.
Loss Functions
The choice of loss function fundamentally shapes how your SOEN model learns. Different objectives require different approaches:
- Cross-Entropy - Standard classification
- Gap Loss - Margin-based robust learning
- Custom Losses - Task-specific objectives
Optimization
SOEN models benefit from adaptive optimization algorithms that can handle the unique parameter landscapes of superconducting circuits.
- AdamW - Adaptive learning with weight decay
- Learning Rate Scheduling - Dynamic rate adjustment
- Gradient Clipping - Stability for physical parameters
Data Handling
Preparing data for SOEN models requires consideration of temporal dynamics and input encoding schemes.
- Temporal Sequences - Time-series data handling
- Input Encoding - Raw vs. one-hot encoding
- Batch Processing - Efficient data loading
Evaluation
Evaluating SOEN models requires metrics that capture both accuracy and the unique aspects of temporal neural dynamics.
- Classification Metrics - Accuracy, top-k accuracy
- Sequence Metrics - Perplexity, bits per character
- Temporal Analysis - Convergence dynamics
Current Method: YAML Configuration
Currently, SOEN experiments are defined using YAML configuration files. This approach provides a structured way to specify all training parameters, from basic settings like batch size and learning rate to complex multi-loss objectives and advanced callbacks.
Note: This is the current method for defining experiments. Future versions may include additional configuration approaches and programmatic APIs.
Detailed YAML Configuration Reference
Experiment Metadata

- description - A brief, human-readable description of the experiment's goal.
- seed - A base seed for ensuring reproducibility (e.g., 42).
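A minimal sketch of this block, assuming these two keys sit at the top level of the experiment file:

```yaml
description: "A brief, human-readable description of the experiment's goal."
seed: 42  # base seed for reproducibility
```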
Training Parameters

Basic Settings

- batch_size: 64
- max_epochs: 100

Optimizer

- name: "adamw" # Options: "adamw", "adam", "lion", etc.
- lr: 0.001
- kwargs.weight_decay: 1e-4
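Assembled into YAML, this might look as follows (a sketch; the nesting under a training: block is an assumption based on the keys above):

```yaml
training:
  batch_size: 64
  max_epochs: 100
  optimizer:
    name: "adamw"      # options: "adamw", "adam", "lion", etc.
    lr: 0.001
    kwargs:
      weight_decay: 1e-4
```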
Loss Functions
💡 Learn More: Understanding the different loss functions is crucial for effective training.
📚 Detailed Loss Functions Guide →

Example: Cross-Entropy Loss
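A sketch of how a cross-entropy objective might be declared (the losses list layout, the name key, and the weight field are assumptions, not the confirmed schema):

```yaml
losses:
  - name: "cross_entropy"   # standard classification objective
    weight: 1.0             # relative weight when combining multiple losses
```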
Data
Basic Settings

- data_path: "path/to/your/dataset.h5" # HDF5/NPZ or custom loader path
- cache_data: true # Load into RAM if it fits
- num_classes: 10
- val_split: 0.2
- test_split: 0.1
Sequence Processing

- sequence_length: 100
- target_seq_len: null # Resample length (null keeps original)
- min_scale: -1.0
- max_scale: 1.0
Input Encoding

- input_encoding: "raw" # raw | one_hot | embedding
- vocab_size: null # required for one_hot
- one_hot_dtype: "float32"
Optional Splits

- csv_data_paths - separate train/val/test CSV files are supported.
Synthetic Dataset

- synthetic: false
- synthetic_kwargs: none
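Putting the data options together (a sketch; grouping everything under a single data: block is an assumption):

```yaml
data:
  data_path: "path/to/your/dataset.h5"   # HDF5/NPZ or custom loader path
  cache_data: true                       # load into RAM if it fits
  num_classes: 10
  val_split: 0.2
  test_split: 0.1
  sequence_length: 100
  target_seq_len: null                   # resample length (null keeps original)
  min_scale: -1.0
  max_scale: 1.0
  input_encoding: "raw"                  # raw | one_hot | embedding
  vocab_size: null                       # required for one_hot
  one_hot_dtype: "float32"
  synthetic: false
```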
Model
Model Loading

- base_model_path: "path/to/your/base_model.pth"
- load_exact_model_state: false
Time Pooling

- time_pooling.name: "final" # max | mean | rms | final | mean_last_n | mean_range | ewa
- time_pooling.params.scale: 1.0
- range_start / range_end are supported for mean_range
Simulation

- dt: 195.3125
- dt_learnable: false
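As YAML, this block might look like the following sketch (the dotted keys above suggest nesting under time_pooling; the surrounding model: grouping is an assumption):

```yaml
model:
  base_model_path: "path/to/your/base_model.pth"
  load_exact_model_state: false
  time_pooling:
    name: "final"     # max | mean | rms | final | mean_last_n | mean_range | ewa
    params:
      scale: 1.0
  dt: 195.3125        # simulation time step
  dt_learnable: false
```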
Logging
Project Structure

- project_dir: "experiments/"
- project_name: "SOEN_Experiments"
- group_name: "My_Experiment_Group"
- experiment_name: null
- Runs are organized into project, group, and experiment folders under project_dir.
Metrics & Frequency

- metrics: ["accuracy", "perplexity", "bits_per_character"]
- log_freq: 50
- log_batch_metrics: true
- log_level: "INFO"
- log_gradients: false
- track_layer_params: false
- track_connections: false
Cloud Upload

- upload_logs_and_checkpoints: false
- s3_upload_url: null
Probing

- dynamic_variables_probing.enabled: false
- dynamic_variables_probing.log_sample0_output_bar_chart: false
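A combined sketch of the logging options (grouping under a logging: block is an assumption; the probing keys are nested per their dotted names above):

```yaml
logging:
  project_dir: "experiments/"
  project_name: "SOEN_Experiments"
  group_name: "My_Experiment_Group"
  experiment_name: null            # no explicit name set here
  metrics: ["accuracy", "perplexity", "bits_per_character"]
  log_freq: 50
  log_batch_metrics: true
  log_level: "INFO"
  upload_logs_and_checkpoints: false
  dynamic_variables_probing:
    enabled: false
```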
Callbacks
Learning Rate Scheduler

- lr_scheduler.type: "constant"
- lr_scheduler.lr: 0.001
- cosine: max_lr, min_lr, warmup_epochs, cycle_epochs, enable_restarts, ...
- linear: max_lr, min_lr, log_space
- greedy/adaptive: factors, patience, warmup, intra_epoch, etc.
Early Stopping
Optional: monitor, patience, mode
Loss Weight Schedulers
Optional: per-loss schedulers (sinusoidal, linear, exponential_decay)
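A sketch of a callbacks block using the cosine scheduler options listed above (the exact nesting, the early-stopping layout, and the "val_loss" metric name are assumptions):

```yaml
callbacks:
  lr_scheduler:
    type: "cosine"
    max_lr: 0.001
    min_lr: 1.0e-5
    warmup_epochs: 5
    cycle_epochs: 50
  early_stopping:
    monitor: "val_loss"   # hypothetical monitored metric
    patience: 10
    mode: "min"
```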
🚀 Getting Started
Quick Start
1. Prepare your dataset in the required format
2. Choose appropriate loss functions for your task
3. Configure training parameters via YAML (a minimal end-to-end sketch follows below)
4. Run training using the provided scripts
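For orientation, here is a minimal experiment file assembled from the keys documented above. The top-level grouping (training, data, model, logging) is an assumption, and the task description is hypothetical; compare against the example configs shipped with the codebase before use:

```yaml
description: "Sequence classification with a small SOEN model."  # hypothetical task
seed: 42

training:
  batch_size: 64
  max_epochs: 100
  optimizer:
    name: "adamw"
    lr: 0.001

data:
  data_path: "path/to/your/dataset.h5"
  num_classes: 10
  val_split: 0.2
  test_split: 0.1
  input_encoding: "raw"

model:
  base_model_path: "path/to/your/base_model.pth"
  time_pooling:
    name: "final"

logging:
  project_dir: "experiments/"
  project_name: "SOEN_Experiments"
  metrics: ["accuracy"]
```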