Build Your First Neural Network with TensorFlow or PyTorch


Neural networks are the foundational components of modern deep learning systems. They enable machines to learn patterns from data and make predictions on unseen inputs. Frameworks like TensorFlow and PyTorch provide robust tools and APIs to design, train, and deploy these networks across a wide range of tasks.

Whether you’re working on image classification, natural language processing, or time-series prediction, mastering at least one of these frameworks is essential.

This guide walks you through building a basic feedforward neural network using both TensorFlow (via Keras) and PyTorch. You’ll learn how to load data, define a model, train it, evaluate performance, and compare workflows across the two platforms.

Prerequisites

Before getting started, ensure you have the following:

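  • Python 3.9 or higher (recent TensorFlow and PyTorch releases have dropped support for older versions; check each project's install page for the exact range)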
  • Basic knowledge of Python and NumPy
  • Familiarity with machine learning fundamentals (optional but helpful)
  • TensorFlow or PyTorch, installed with pip:

Install either:

pip install tensorflow

or:

pip install torch torchvision

Dataset: MNIST Handwritten Digit Classification

We will use the MNIST dataset, which contains 70,000 grayscale images of handwritten digits from 0 to 9 (60,000 for training and 10,000 for testing). Each image is 28×28 pixels. The goal is to build a neural network that correctly classifies each image into one of the 10 digit categories.

Option 1: Building the Neural Network with TensorFlow (Keras)

Step 1: Import Libraries

import tensorflow as tf
from tensorflow.keras import layers, models

Step 2: Load and Prepare the Data

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to the range [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
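
To see what the normalization step does without downloading anything, here is a minimal NumPy sketch. The fake_images array is a made-up stand-in for x_train, not real MNIST data.

```python
import numpy as np

# Hypothetical stand-in for x_train: 4 random 28x28 images with uint8 pixels (0-255).
rng = np.random.default_rng(seed=0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same operation as in the tutorial: cast to float32, then divide by 255.
scaled = fake_images.astype("float32") / 255.0

print(scaled.dtype)                              # float32
print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```

Keeping inputs in a small, consistent range generally makes gradient-based training better behaved than feeding raw 0-255 values.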

Step 3: Define the Neural Network Model

model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),    # Converts each 28x28 image to a 784-dimensional vector
    layers.Dense(128, activation='relu'),    # Fully connected hidden layer
    layers.Dense(10, activation='softmax')   # Output layer with 10 units for 10 classes
])
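
It can help to work out how many trainable parameters this architecture has. The sketch below is plain Python arithmetic; dense_params is a helper name introduced here for illustration, and model.summary() should report the same total.

```python
# Dense layer parameters = inputs * units (weights) + units (biases).
def dense_params(n_inputs: int, n_units: int) -> int:
    return n_inputs * n_units + n_units

flattened = 28 * 28                      # Flatten only reshapes; it has no parameters
hidden = dense_params(flattened, 128)    # 784 * 128 + 128
output = dense_params(128, 10)           # 128 * 10 + 10
total = hidden + output

print(flattened, hidden, output, total)  # 784 100480 1290 101770
```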

Step 4: Compile the Model

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
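
The "sparse" in the loss name means the labels are plain integers (0 to 9) rather than one-hot vectors. As a rough illustration of what the loss measures, here is a NumPy sketch with made-up probabilities. It is not Keras's implementation, which adds numerical safeguards.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())

# Two made-up predictions over 3 classes (each row sums to 1).
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1]]
labels = [0, 1]  # integer class ids, not one-hot vectors

loss = sparse_categorical_crossentropy(probs, labels)
print(round(loss, 4))  # 0.2899
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.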

Step 5: Train the Model

model.fit(
    x_train,
    y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.1
)
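
According to the Keras documentation, validation_split holds out the last fraction of the samples before shuffling. The arithmetic below sketches what that means for these settings; the exact rounding Keras applies may differ for dataset sizes that do not divide evenly.

```python
import math

n_samples = 60_000          # size of the MNIST training set
validation_split = 0.1
batch_size = 32

n_val = round(n_samples * validation_split)   # samples held out for validation
n_train = n_samples - n_val                   # samples actually used for training
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 54000 6000 1688
```

So the progress bar should count about 1,688 steps per epoch rather than 1,875.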

Step 6: Evaluate Model Performance

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)
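
Beyond the aggregate accuracy, you will usually want the predicted digit for individual images. Keras's model.predict returns one row of 10 probabilities per image, and the predicted digit is the index of the largest entry. The NumPy sketch below uses made-up probabilities in place of real model output.

```python
import numpy as np

# Made-up softmax outputs for 3 images over 10 digit classes.
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 9] = 0.82
true_labels = np.array([7, 2, 4])   # hypothetical ground truth

predicted = probs.argmax(axis=1)                    # most likely digit per image
accuracy = float((predicted == true_labels).mean())  # fraction predicted correctly

print(predicted.tolist())  # [7, 2, 9]
print(accuracy)
```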

Option 2: Building the Neural Network with PyTorch

Step 1: Import Libraries

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

Step 2: Load and Preprocess Data

# Transform converts images to tensor and normalizes to [0,1]
transform = transforms.Compose([transforms.ToTensor()])

train_dataset = datasets.MNIST(root='.', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='.', train=False, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32)
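
To get a feel for what a DataLoader yields, here is a small sketch that swaps MNIST for random tensors of the same shape. The fake_* names are assumptions made for illustration; only the batching behavior carries over.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 100 random "images" shaped (1, 28, 28), labels 0-9.
torch.manual_seed(0)
fake_images = torch.rand(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=32, shuffle=True)

batch_sizes = [images.shape[0] for images, labels in loader]
first_images, first_labels = next(iter(loader))

print(len(loader), batch_sizes)   # 4 [32, 32, 32, 4]
print(tuple(first_images.shape))  # (32, 1, 28, 28)
print(tuple(first_labels.shape))  # (32,)
```

Note that the last batch is smaller when the dataset size is not a multiple of batch_size, which is why the evaluation code counts samples with labels.size(0) instead of assuming 32.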

Step 3: Define the Neural Network

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        return self.fc2(x)

model = NeuralNetwork()
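
A quick sanity check before training is to push a random batch through the network and confirm the output shape. The sketch below repeats the class so it runs on its own; dummy_batch is a made-up input, not MNIST data.

```python
import torch
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        return self.fc2(x)

model = NeuralNetwork()

# Random stand-in for one batch of 32 grayscale 28x28 images.
dummy_batch = torch.rand(32, 1, 28, 28)
logits = model(dummy_batch)
n_params = sum(p.numel() for p in model.parameters())

print(tuple(logits.shape))  # (32, 10)
print(n_params)             # 101770
```

The parameter count matches the Keras model, since both networks have the same two fully connected layers.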

Step 4: Define Loss Function and Optimizer

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
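
Unlike the Keras model, the PyTorch network has no softmax layer. That is intentional: nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax internally. The sketch below checks this on random values, which are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # made-up raw scores for 4 samples
labels = torch.tensor([3, 0, 9, 1])    # made-up integer class labels

# Built-in loss applied directly to logits.
loss_builtin = nn.CrossEntropyLoss()(logits, labels)

# The same quantity computed in two explicit steps.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(torch.allclose(loss_builtin, loss_manual))  # True
```

Adding your own softmax before this loss would apply the normalization twice and typically hurts training.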

Step 5: Train the Model

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1} completed")
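
The optimizer.zero_grad() call matters because PyTorch adds new gradients to whatever is already stored on each parameter. Here is a minimal sketch with a single scalar parameter, chosen only to make the numbers easy to follow.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# Two backward passes without clearing: the gradients add up.
(3.0 * w).backward()
(3.0 * w).backward()
accumulated = w.grad.item()   # 3 + 3

# Clear the stored gradient, then run one more backward pass.
w.grad.zero_()
(3.0 * w).backward()
fresh = w.grad.item()         # just 3

print(accumulated, fresh)  # 6.0 3.0
```

Without the reset, each optimizer step in the training loop would use the sum of gradients from all previous batches.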

Step 6: Evaluate on Test Data

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        # Index of the largest logit is the predicted digit
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print("Test accuracy:", correct / total)
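
The evaluation loop takes the largest logit directly. If you also want class probabilities, you can apply softmax yourself; the predicted class stays the same. A small sketch on random logits (made-up values, not model output):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(5, 10)             # stand-in for model(images)
probs = torch.softmax(logits, dim=1)    # each row becomes a probability distribution

same_prediction = torch.equal(logits.argmax(dim=1), probs.argmax(dim=1))
rows_sum_to_one = torch.allclose(probs.sum(dim=1), torch.ones(5))

print(same_prediction, rows_sum_to_one)  # True True
```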

Comparison: TensorFlow (Keras) vs PyTorch

Feature            | TensorFlow (Keras)                                          | PyTorch
-------------------|-------------------------------------------------------------|-----------------------------------------
API Style          | Declarative and high-level                                  | Imperative and more flexible
Learning Curve     | Easier for beginners                                        | More granular control for advanced users
Dynamic Graph      | Eager execution by default in TF 2.x; graphs via tf.function | Dynamic by default
Community Usage    | Widely used in industry and production                      | Widely used in academia and research
Deployment Options | TensorFlow Lite, TensorFlow.js, TF Serving                  | TorchScript, ONNX
Ecosystem          | Rich ecosystem including TFX, Keras, etc.                   | Integrates well with Pythonic tooling

Suggested Enhancements

Here are some ideas to improve your neural network beyond this basic version:

  1. Add Dropout Layers to prevent overfitting:
    • layers.Dropout(0.3) in Keras or nn.Dropout(0.3) in PyTorch
  2. Use Convolutional Neural Networks (CNNs):
    • Especially effective for image data like MNIST
  3. Track Metrics with TensorBoard or WandB:
    • Use tf.keras.callbacks.TensorBoard in TensorFlow
    • Use wandb, torch.utils.tensorboard, or tensorboardX for PyTorch
  4. Enable GPU Acceleration:
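    • TensorFlow: a visible GPU is used automatically; wrap code in with tf.device('/GPU:0'): to pin operations to a specific device
    • PyTorch: model.to('cuda'), then move each batch inside the loop with images = images.to('cuda') and labels = labels.to('cuda')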
  5. Save and Load Models:
    • TensorFlow: model.save('model.keras'), tf.keras.models.load_model('model.keras') (Keras 3 requires a .keras or .h5 file extension)
    • PyTorch: torch.save(model.state_dict(), 'model.pth'), model.load_state_dict(torch.load('model.pth'))
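
Several of these ideas can be combined in a few lines. The sketch below is one possible arrangement in PyTorch, under stated assumptions: build_model is a helper name introduced here, the file name is arbitrary, and random tensors stand in for MNIST images. It adds dropout, selects a GPU when one is available, and round-trips the weights through a saved state_dict.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Use the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def build_model() -> nn.Module:
    # Same layout as the tutorial network, plus dropout after the hidden layer.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128),
        nn.ReLU(),
        nn.Dropout(0.3),
        nn.Linear(128, 10),
    )

model = build_model().to(device)
model.eval()  # eval mode disables dropout, so outputs are deterministic

# Save only the weights, then load them into a freshly built model.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)
    restored = build_model().to(device)
    restored.load_state_dict(torch.load(path, map_location=device))
restored.eval()

# Random stand-in for a batch of 8 images, moved to the same device as the model.
images = torch.rand(8, 1, 28, 28).to(device)
with torch.no_grad():
    outputs_match = torch.allclose(model(images), restored(images))

print(outputs_match)  # True
```

Remember to switch back to model.train() before further training, since dropout is only active in training mode.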

Final Thoughts

Both TensorFlow and PyTorch provide a comprehensive suite of tools for building, training, and deploying deep learning models. This introductory project using MNIST helps you understand how these frameworks work in practice and lays the groundwork for more advanced tasks.

As you continue your deep learning journey, experiment with more complex datasets, architectures like CNNs or RNNs, and real-world applications such as object detection or sentiment analysis.

