Goal
This project implements a fully customizable neural network using just Python and NumPy. It supports multiple layers, a choice of activation functions, several optimizers, cross-entropy loss, binary classification, dropout, and L2 regularization.
The project aims to demonstrate how neural networks can be built and trained using only core Python libraries, without relying on high-level frameworks like TensorFlow or PyTorch. By doing this, you gain deeper insight into the underlying mechanisms of backpropagation, gradient descent, and optimization algorithms.
ml_utils.py
This module provides custom implementations of neural network utilities, including optimizers, loss functions, activation functions, and a configurable network model.
Classes
- AdamOptimizer: Implements the Adam optimization algorithm for training neural networks. It includes parameters such as the learning rate, beta values for the moment estimates, and a regularization parameter (see the update-rule sketch after this class list).
- CrossEntropyLoss: Custom implementation of the cross-entropy loss for multi-class classification, calculated using the formula:
  Loss = - (1/m) * Σ (y * log(p) + (1 - y) * log(1 - p))
  where `m` is the number of samples.
- BCEWithLogitsLoss: Custom binary cross-entropy loss that operates on logits. It applies the sigmoid function to the logits to obtain probabilities `p`, then computes:
  Loss = -mean(y * log(p) + (1 - y) * log(1 - p))
  (a small NumPy sketch of this computation appears after the class list below.)
- NeuralNetwork: A class for training and evaluating a custom neural network model. Key features include:
- Supports multiple layers with customizable sizes and activation functions.
- Implements forward and backward propagation.
- Supports dropout regularization and L2 regularization.
- Includes a method for training with mini-batch gradient descent, along with early stopping (see the training-loop sketch after this list).
- Provides functionality for hyperparameter tuning via grid search.
- Evaluates model performance on training and test data.
- Layer: Represents a single layer in the neural network. Contains attributes for `weights`, `biases`, `activation_function`, and `gradients`.
  - Methods:
    - `activate(Z)`: Applies the specified activation function.
    - `activation_derivative(Z)`: Returns the derivative of the activation function for backpropagation.
- Activation: Contains static methods for various activation functions and their derivatives (see the sketch after this list):
  - `relu(z)`, `relu_derivative(z)`: ReLU activation and its derivative.
  - `leaky_relu(z, alpha)`, `leaky_relu_derivative(z, alpha)`: Leaky ReLU activation and its derivative.
  - `tanh(z)`, `tanh_derivative(z)`: Tanh activation and its derivative.
  - `sigmoid(z)`, `sigmoid_derivative(z)`: Sigmoid activation and its derivative.
  - `softmax(z)`: Softmax activation function.
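
For reference, the sketch below shows the Adam update rule that `AdamOptimizer` is described as implementing. It is a minimal illustration assuming conventional hyperparameter names (`lr`, `beta1`, `beta2`, `eps`); the actual attribute names in `ml_utils.py` may differ.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter array (illustrative sketch, t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad          # update biased first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # update biased second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-correct the first moment
    v_hat = v / (1 - beta2 ** t)                # bias-correct the second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```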
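
A rough NumPy sketch of the binary cross-entropy formula above, written as a hypothetical standalone function rather than the exact code inside `BCEWithLogitsLoss`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_with_logits(logits, y, eps=1e-12):
    """Loss = -mean(y * log(p) + (1 - y) * log(1 - p)), with p = sigmoid(logits)."""
    p = np.clip(sigmoid(logits), eps, 1 - eps)   # clip so log(0) never occurs
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
```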
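
The training procedure described for `NeuralNetwork` (mini-batch gradient descent with early stopping) could look roughly like the sketch below. The method names `forward`, `backward`, `update`, and `compute_loss` are placeholders for whatever the class exposes internally, not its actual API.

```python
import numpy as np

def train(model, X, y, X_val, y_val, epochs=100, batch_size=32, patience=5):
    """Mini-batch training with early stopping on validation loss (illustrative)."""
    best_val, wait = np.inf, 0
    n = X.shape[0]
    for epoch in range(epochs):
        perm = np.random.permutation(n)            # reshuffle samples each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            preds = model.forward(X[idx])          # forward pass on the mini-batch
            model.backward(X[idx], y[idx], preds)  # backpropagate gradients
            model.update()                         # optimizer step (e.g. Adam)
        val_loss = model.compute_loss(X_val, y_val)
        if val_loss < best_val:                    # improvement: reset patience counter
            best_val, wait = val_loss, 0
        else:
            wait += 1
            if wait >= patience:                   # no improvement for `patience` epochs
                break
```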
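
Finally, the activation functions listed for the `Activation` class are typically written in NumPy as below. This is an illustration of the formulas, not a copy of the class; the max-subtraction in `softmax` is an implementation assumption for numerical stability.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def relu_derivative(z):
    return (z > 0).astype(float)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)

def tanh_derivative(z):
    return 1.0 - np.tanh(z) ** 2

def softmax(z):
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))  # subtract row max for stability
    return e / np.sum(e, axis=-1, keepdims=True)
```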
Testing Functions
- test_breast_cancer(): Evaluates the network on the Breast Cancer dataset, providing accuracy and classification metrics.
- test_iris(): Evaluates the network on the Iris dataset, offering similar metrics for model performance.
Example Usage

```python
from ml_utils import NeuralNetwork, AdamOptimizer
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load dataset and prepare data
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize neural network and optimizer
# (fill in layer sizes and activation names appropriate for your data)
network = NeuralNetwork(layers=[...], activations=[...])
optimizer = AdamOptimizer(network=network)

# Train and evaluate
optimizer.train(X_train, y_train)
predictions = network.predict(X_test)
print(predictions)
```
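
Assuming `predict` returns class labels, accuracy on the held-out split could then be checked with scikit-learn (an illustrative follow-up, not part of `ml_utils.py`):

```python
from sklearn.metrics import accuracy_score

# Compare predicted class labels against the true test labels
print("Test accuracy:", accuracy_score(y_test, predictions))
```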