Training SOEN Models
Understanding how to train superconducting optoelectronic neural networks
Training Overview
Training SOEN models involves optimizing the parameters of superconducting circuits to perform specific computational tasks. Unlike traditional neural networks, SOEN models operate with temporal dynamics and physical constraints that require specialized training approaches.
The training process encompasses several key components: data preparation, loss function selection, optimization strategies, and evaluation metrics. Each component must be carefully configured to account for the unique properties of superconducting optoelectronic hardware.
Loss Functions
The choice of loss function fundamentally shapes how your SOEN model learns. Different objectives require different approaches:
- Cross-Entropy - Standard classification
- Gap Loss - Margin-based robust learning
- Custom Losses - Task-specific objectives
Optimization
SOEN models benefit from adaptive optimization algorithms that can handle the unique parameter landscapes of superconducting circuits.
- AdamW - Adaptive learning with weight decay
- Learning Rate Scheduling - Dynamic rate adjustment
- Gradient Clipping - Stability for physical parameters
Data Handling
Preparing data for SOEN models requires consideration of temporal dynamics and input encoding schemes.
- Temporal Sequences - Time-series data handling
- Input Encoding - Raw vs. one-hot encoding
- Batch Processing - Efficient data loading
Evaluation
Evaluating SOEN models requires metrics that capture both accuracy and the unique aspects of temporal neural dynamics.
- Classification Metrics - Accuracy, top-k accuracy
- Sequence Metrics - Perplexity, bits per character
- Temporal Analysis - Convergence dynamics
Current Method: YAML Configuration
Currently, SOEN experiments are defined using YAML configuration files. This approach provides a structured way to specify all training parameters, from basic settings like batch size and learning rate to complex multi-loss objectives and advanced callbacks.
Note: This is the current method for defining experiments. Future versions may include additional configuration approaches and programmatic APIs.
Detailed YAML Configuration Reference
Experiment Metadata

- description - A brief, human-readable description of the experiment's goal.
- seed - A base seed for ensuring reproducibility (e.g., 42).
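A minimal sketch of this block, assuming these two keys sit at the top level of the experiment file:

```yaml
description: "A brief, human-readable description of the experiment's goal."
seed: 42  # base seed for reproducibility
```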
Training Parameters

Basic Settings

- batch_size: 64
- max_epochs: 100

Optimizer

- name: "adamw" # Options: "adamw", "adam", "lion", etc.
- lr: 0.001
- kwargs.weight_decay: 1e-4
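Assembled into YAML, this might look as follows (a sketch; the nesting under a training: block is an assumption based on the keys above):

```yaml
training:
  batch_size: 64
  max_epochs: 100
  optimizer:
    name: "adamw"      # options: "adamw", "adam", "lion", etc.
    lr: 0.001
    kwargs:
      weight_decay: 1e-4
```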
Loss Functions
💡 Learn More: Understanding the different loss functions is crucial for effective training.
📚 Detailed Loss Functions Guide →

Example: Cross-Entropy Loss
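A sketch of how a cross-entropy objective might be declared (the losses list layout, the name key, and the weight field are assumptions, not the confirmed schema):

```yaml
losses:
  - name: "cross_entropy"   # standard classification objective
    weight: 1.0             # relative weight when combining multiple losses
```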
Data
Basic Settings

- data_path: "path/to/your/dataset.h5" # HDF5/NPZ or custom loader path
- cache_data: true # Load into RAM if it fits
- num_classes: 10
- val_split: 0.2
- test_split: 0.1
Sequence Processing

- sequence_length: 100
- target_seq_len: null # Resample length (null keeps original)
- min_scale: -1.0
- max_scale: 1.0
Input Encoding

- input_encoding: "raw" # raw | one_hot | embedding
- vocab_size: null # required for one_hot
- one_hot_dtype: "float32"
Optional Splits

- csv_data_paths - separate train/val/test CSV files are supported.
Synthetic Dataset

- synthetic: false
- synthetic_kwargs: none
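Putting the data options together (a sketch; grouping everything under a single data: block is an assumption):

```yaml
data:
  data_path: "path/to/your/dataset.h5"   # HDF5/NPZ or custom loader path
  cache_data: true                       # load into RAM if it fits
  num_classes: 10
  val_split: 0.2
  test_split: 0.1
  sequence_length: 100
  target_seq_len: null                   # resample length (null keeps original)
  min_scale: -1.0
  max_scale: 1.0
  input_encoding: "raw"                  # raw | one_hot | embedding
  vocab_size: null                       # required for one_hot
  one_hot_dtype: "float32"
  synthetic: false
```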
Model
Model Loading

- base_model_path: "path/to/your/base_model.pth"
- load_exact_model_state: false
Time Pooling

- time_pooling.name: "final" # max | mean | rms | final | mean_last_n | mean_range | ewa
- time_pooling.params.scale: 1.0
- range_start / range_end are supported for mean_range
Simulation

- dt: 195.3125
- dt_learnable: false
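As YAML, this block might look like the following sketch (the dotted keys above suggest nesting under time_pooling; the surrounding model: grouping is an assumption):

```yaml
model:
  base_model_path: "path/to/your/base_model.pth"
  load_exact_model_state: false
  time_pooling:
    name: "final"     # max | mean | rms | final | mean_last_n | mean_range | ewa
    params:
      scale: 1.0
  dt: 195.3125        # simulation time step
  dt_learnable: false
```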
Logging
Project Structure

- project_dir: "experiments/"
- project_name: "SOEN_Experiments"
- group_name: "My_Experiment_Group"
- experiment_name: null
- Runs are organized into project, group, and experiment folders under project_dir.
Metrics & Frequency

- metrics: ["accuracy", "perplexity", "bits_per_character"]
- log_freq: 50
- log_batch_metrics: true
- log_level: "INFO"
- log_gradients: false
- track_layer_params: false
- track_connections: false
Cloud Upload

- upload_logs_and_checkpoints: false
- s3_upload_url: null
Probing

- dynamic_variables_probing.enabled: false
- dynamic_variables_probing.log_sample0_output_bar_chart: false
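A combined sketch of the logging options (grouping under a logging: block is an assumption; the probing keys are nested per their dotted names above):

```yaml
logging:
  project_dir: "experiments/"
  project_name: "SOEN_Experiments"
  group_name: "My_Experiment_Group"
  experiment_name: null            # no explicit name set here
  metrics: ["accuracy", "perplexity", "bits_per_character"]
  log_freq: 50
  log_batch_metrics: true
  log_level: "INFO"
  upload_logs_and_checkpoints: false
  dynamic_variables_probing:
    enabled: false
```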
Callbacks
Learning Rate Scheduler

- lr_scheduler.type: "constant"
- lr_scheduler.lr: 0.001
- cosine: max_lr, min_lr, warmup_epochs, cycle_epochs, enable_restarts, ...
- linear: max_lr, min_lr, log_space
- greedy/adaptive: factors, patience, warmup, intra_epoch, etc.
Early Stopping
Optional: monitor, patience, mode
Loss Weight Schedulers
Optional: per-loss schedulers (sinusoidal, linear, exponential_decay)
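A sketch of a callbacks block using the cosine scheduler options listed above (the exact nesting, the early-stopping layout, and the "val_loss" metric name are assumptions):

```yaml
callbacks:
  lr_scheduler:
    type: "cosine"
    max_lr: 0.001
    min_lr: 1.0e-5
    warmup_epochs: 5
    cycle_epochs: 50
  early_stopping:
    monitor: "val_loss"   # hypothetical monitored metric
    patience: 10
    mode: "min"
```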
🚀 Getting Started
Quick Start
1. Prepare your dataset in the required format
2. Choose appropriate loss functions for your task
3. Configure training parameters via YAML (a minimal end-to-end sketch follows below)
4. Run training using the provided scripts
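For orientation, here is a minimal experiment file assembled from the keys documented above. The top-level grouping (training, data, model, logging) is an assumption, and the task description is hypothetical; compare against the example configs shipped with the codebase before use:

```yaml
description: "Sequence classification with a small SOEN model."  # hypothetical task
seed: 42

training:
  batch_size: 64
  max_epochs: 100
  optimizer:
    name: "adamw"
    lr: 0.001

data:
  data_path: "path/to/your/dataset.h5"
  num_classes: 10
  val_split: 0.2
  test_split: 0.1
  input_encoding: "raw"

model:
  base_model_path: "path/to/your/base_model.pth"
  time_pooling:
    name: "final"

logging:
  project_dir: "experiments/"
  project_name: "SOEN_Experiments"
  metrics: ["accuracy"]
```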