Goal

This project implements a fully customizable neural network using just Python and NumPy. It supports multiple layers, a choice of activation functions, several optimization algorithms, cross-entropy loss, binary classification, dropout, and L2 regularization.

The project aims to demonstrate how neural networks can be built and trained using only core Python libraries, without relying on high-level frameworks like TensorFlow or PyTorch. By doing this, you gain deeper insight into the underlying mechanisms of backpropagation, gradient descent, and optimization algorithms.

ml_utils.py

This module provides custom implementations of neural network utilities, including optimizers, loss functions, activation functions, and the network and layer classes themselves.

Classes

  • AdamOptimizer: Implements the Adam optimization algorithm for training neural networks. It includes parameters such as the learning rate, beta values for the first- and second-moment estimates, and a regularization parameter. A sketch of the update step appears after this list.
  • CrossEntropyLoss: Custom implementation of the cross-entropy loss for multi-class classification, calculated using the formula:
    Loss = - (1/m) * Σ_i Σ_c y_ic * log(p_ic)
    where m is the number of samples, y_ic is the one-hot label, and p_ic is the predicted probability for class c (both loss formulas are sketched in NumPy after this list).
  • BCEWithLogitsLoss: Custom binary cross-entropy loss that operates on logits. It first applies the sigmoid function to the logits to obtain probabilities p, then computes:
    Loss = -mean(y * log(p) + (1 - y) * log(1 - p))
  • NeuralNetwork: A class for training and evaluating a custom neural network model. Key features include:
    • Supports multiple layers with customizable sizes and activation functions.
    • Implements forward and backward propagation (a per-layer sketch appears after this list).
    • Supports dropout regularization and L2 regularization.
    • Includes a method for training with mini-batch gradient descent, along with early stopping.
    • Provides functionality for hyperparameter tuning via grid search.
    • Evaluates model performance on training and test data.
  • Layer: Represents a single layer in the neural network. Contains attributes for weights, biases, activation_function, and gradients.
    • Methods:
    • activate(Z): Applies the specified activation function.
    • activation_derivative(Z): Returns the derivative of the activation function for backpropagation.
  • Activation: Contains static methods for various activation functions and their derivatives (sketched in NumPy after this list).
    • relu(z), relu_derivative(z): ReLU activation and its derivative.
    • leaky_relu(z, alpha), leaky_relu_derivative(z, alpha): Leaky ReLU activation and its derivative.
    • tanh(z), tanh_derivative(z): Tanh activation and its derivative.
    • sigmoid(z), sigmoid_derivative(z): Sigmoid activation and its derivative.
    • softmax(z): Softmax activation function.
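
For orientation, here is a minimal NumPy sketch of the update rule that AdamOptimizer is based on. The variable and parameter names are illustrative, not necessarily the module's actual attributes:

import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter array (illustrative sketch)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction, t = step count (>= 1)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v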
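
The two loss formulas translate almost directly into NumPy. The sketch below assumes one-hot labels for CrossEntropyLoss and uses a numerically stable rewrite of the BCEWithLogitsLoss formula; the module's actual code may differ in detail:

import numpy as np

def cross_entropy(y_onehot, probs, eps=1e-12):
    """Multi-class cross-entropy: -(1/m) * sum over i, c of y_ic * log(p_ic)."""
    m = y_onehot.shape[0]
    return -np.sum(y_onehot * np.log(probs + eps)) / m

def bce_with_logits(y, logits):
    """Binary cross-entropy on logits, equivalent to
    -mean(y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))),
    written in the standard numerically stable form."""
    return np.mean(np.maximum(logits, 0) - logits * y + np.log1p(np.exp(-np.abs(logits))))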
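
The forward and backward propagation that NeuralNetwork and Layer implement boils down to one matrix multiply and one activation per layer. The sketch below assumes row-major batches (samples along axis 0) and folds the L2 term into the weight gradient; the class-based code organizes the same math around Layer attributes:

import numpy as np

def layer_forward(A_prev, W, b, activation):
    """One layer's forward step: Z = A_prev @ W + b, A = activation(Z)."""
    Z = A_prev @ W + b
    return activation(Z), Z

def layer_backward(dA, Z, A_prev, W, activation_derivative, m, lam=0.0):
    """One layer's backward step; lam is the L2 regularization strength."""
    dZ = dA * activation_derivative(Z)
    dW = A_prev.T @ dZ / m + (lam / m) * W      # gradient w.r.t. weights (+ L2 term)
    db = np.sum(dZ, axis=0, keepdims=True) / m  # gradient w.r.t. biases
    dA_prev = dZ @ W.T                          # gradient passed to the previous layer
    return dA_prev, dW, db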
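
The Activation helpers map directly onto NumPy operations. This is a sketch of the usual definitions; the module's implementations should look very similar:

import numpy as np

def relu(z):
    return np.maximum(0, z)

def relu_derivative(z):
    return (z > 0).astype(float)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def leaky_relu_derivative(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)

def tanh(z):
    return np.tanh(z)

def tanh_derivative(z):
    return 1 - np.tanh(z) ** 2

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)

def softmax(z):
    # subtract the row-wise max for numerical stability
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)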

Testing Functions

  • test_breast_cancer(): Evaluates the network on the Breast Cancer dataset, providing accuracy and classification metrics.
  • test_iris(): Evaluates the network on the Iris dataset, offering similar metrics for model performance.
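
Assuming the two testing functions take no required arguments, they can be run directly to exercise the module end to end:

from ml_utils import test_breast_cancer, test_iris

# Run the built-in evaluations on the two scikit-learn datasets
test_breast_cancer()
test_iris()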

Example Usage

# Example usage:
from ml_utils import NeuralNetwork, AdamOptimizer
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load dataset and prepare data
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize neural network and optimizer
# Fill in layer sizes and matching activation names for your data
network = NeuralNetwork(layers=[...], activations=[...])
optimizer = AdamOptimizer(network=network)

# Train and evaluate
optimizer.train(X_train, y_train)
predictions = network.predict(X_test)
print(predictions)
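
To sanity-check the result, the predictions can be compared against the held-out labels (this assumes predict returns class labels rather than probabilities):

import numpy as np

# Fraction of correctly classified test samples
accuracy = np.mean(predictions == y_test)
print(f"Test accuracy: {accuracy:.3f}")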