Build Your First Neural Network with TensorFlow or PyTorch

Neural networks are the foundational components of modern deep learning systems. They enable machines to learn patterns from data and make predictions on unseen inputs. Frameworks like TensorFlow and PyTorch provide robust tools and APIs to design, train, and deploy these networks across a wide range of tasks.

Whether you’re working on image classification, natural language processing, or time-series prediction, mastering at least one of these frameworks is essential.

This guide walks you through building a basic feedforward neural network using both TensorFlow (via Keras) and PyTorch. You’ll learn how to load data, define a model, train it, evaluate performance, and compare workflows across the two platforms.

Prerequisites

Before getting started, ensure you have the following:

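A Python 3 version supported by your chosen framework (recent TensorFlow and PyTorch releases no longer support Python 3.6, so check the framework's install page for the current minimum)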
Basic knowledge of Python and NumPy
Familiarity with machine learning fundamentals (optional but helpful)
TensorFlow or PyTorch installed (you only need the one you plan to follow)

Install either:

pip install tensorflow

or:

pip install torch torchvision

Dataset: MNIST Handwritten Digit Classification

We will use the MNIST dataset, which contains 70,000 grayscale images of handwritten digits from 0 to 9: 60,000 for training and 10,000 for testing. Each image is 28×28 pixels, with integer pixel values from 0 to 255. The goal is to build a neural network that correctly classifies each image into one of the 10 digit categories.

Option 1: Building the Neural Network with TensorFlow (Keras)

Step 1: Import Libraries

import tensorflow as tf
from tensorflow.keras import layers, models

Step 2: Load and Prepare the Data

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to the range [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
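To see what the scaling step does without downloading anything, here is a minimal NumPy sketch on a synthetic batch. The array `fake_images` is a hypothetical stand-in for `x_train`, not real MNIST data.

```python
import numpy as np

# Hypothetical stand-in for x_train: 4 random 28x28 images with uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same scaling as above: cast to float32, then divide by 255
scaled = fake_images.astype("float32") / 255.0

print(scaled.dtype, scaled.shape)
print(float(scaled.min()) >= 0.0, float(scaled.max()) <= 1.0)
```

Casting before dividing keeps the result in float32 instead of NumPy's default float64, which halves memory use for the full dataset.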

Step 3: Define the Neural Network Model

model = models.Sequential([
    layers.Input(shape=(28, 28)),                   # Declares the 28x28 input explicitly (newer Keras warns about input_shape on layers)
    layers.Flatten(),                               # Converts 28x28 matrix to 784-dimensional vector
    layers.Dense(128, activation='relu'),           # Fully connected hidden layer
    layers.Dense(10, activation='softmax')          # Output layer with 10 units for 10 classes
])
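It can help to work out how many trainable parameters this architecture has. The sketch below does the arithmetic in plain Python; `dense_params` is a hypothetical helper, not a Keras function. The total should agree with what `model.summary()` reports.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened = 28 * 28                          # Flatten has no trainable parameters
hidden_params = dense_params(flattened, 128)  # 784 * 128 + 128
output_params = dense_params(128, 10)         # 128 * 10 + 10
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 100480 1290 101770
```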

Step 4: Compile the Model

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
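The "sparse" in the loss name means the labels are plain integers (0 to 9) rather than one-hot vectors. As a rough illustration, the NumPy sketch below computes the same quantity by hand for two made-up predictions. The function is a simplified stand-in that assumes the inputs are already valid probabilities, and it omits the numerical clipping a real implementation performs.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    # Pick, for each row, the probability predicted for its true class
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())

# Two hypothetical predictions over 3 classes (each row sums to 1)
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.1, 0.8]]
labels = [0, 2]   # integer labels, not one-hot vectors

loss = sparse_categorical_crossentropy(probs, labels)
print(round(loss, 4))  # 0.2899
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1.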

Step 5: Train the Model

model.fit(
    x_train,
    y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.1
)
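With `validation_split=0.1`, Keras holds out the last 10% of the training arrays (taken before any shuffling) for validation and trains on the rest. A quick back-of-the-envelope calculation, assuming the standard 60,000-image training split, shows how many batches to expect per epoch in the progress bar:

```python
import math

num_samples = 60_000        # size of the MNIST training split
validation_split = 0.1
batch_size = 32

num_val = int(num_samples * validation_split)    # samples held out for validation
num_train = num_samples - num_val                # samples used for gradient updates
steps_per_epoch = math.ceil(num_train / batch_size)

print(num_val, num_train, steps_per_epoch)       # 6000 54000 1688
```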

Step 6: Evaluate Model Performance

test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

Option 2: Building the Neural Network with PyTorch

Step 1: Import Libraries

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

Step 2: Load and Preprocess Data

# ToTensor converts each image to a float tensor of shape (1, 28, 28) and scales pixel values to [0, 1]
transform = transforms.Compose([transforms.ToTensor()])

train_dataset = datasets.MNIST(root='.', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='.', train=False, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32)
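To get a feel for what a `DataLoader` yields, the sketch below batches a small synthetic dataset. The random tensors are hypothetical stand-ins shaped like MNIST samples after `ToTensor`, so nothing is downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 100 random single-channel 28x28 "images"
fake_images = torch.rand(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=32, shuffle=True)

# Collect the shapes of every (images, labels) batch the loader produces
batch_shapes = [(images.shape, labels.shape) for images, labels in loader]
num_batches = len(loader)   # ceil(100 / 32) = 4; the last batch holds 4 samples

print(num_batches)
print(batch_shapes[0])
print(batch_shapes[-1])
```

Each batch has shape (batch, channels, height, width). The final batch is smaller unless you pass `drop_last=True`.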

Step 3: Define the Neural Network

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        return self.fc2(x)  # Raw logits; nn.CrossEntropyLoss applies log-softmax internally

model = NeuralNetwork()
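Before training, it is worth pushing a dummy batch through the network to confirm the shapes line up. The sketch below repeats the class definition so it runs on its own. The random input `dummy_batch` is an assumption standing in for a real batch of images.

```python
import torch
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        return self.fc2(x)  # raw logits, one per class

model = NeuralNetwork()

# Hypothetical batch of 4 images shaped like MNIST after ToTensor
dummy_batch = torch.rand(4, 1, 28, 28)
with torch.no_grad():
    logits = model(dummy_batch)
    probs = torch.softmax(logits, dim=1)   # convert logits to class probabilities

print(logits.shape)        # torch.Size([4, 10])
print(probs.sum(dim=1))    # each row sums to approximately 1
```

Unlike the Keras model, this network has no softmax layer, so its outputs are unnormalized scores until you apply `torch.softmax` yourself.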

Step 4: Define Loss Function and Optimizer

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
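`nn.CrossEntropyLoss` expects raw logits and integer class labels, which is why the model above has no softmax layer. The short check below, on made-up logits, illustrates that it matches log-softmax followed by negative log-likelihood:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)              # hypothetical raw scores for 5 samples
labels = torch.randint(0, 10, (5,))      # integer class labels

criterion = nn.CrossEntropyLoss()
loss_builtin = criterion(logits, labels)

# Same computation written out in two steps
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

losses_match = torch.allclose(loss_builtin, loss_manual)
print(losses_match)  # True
```

Adding a softmax layer before this loss would apply the normalization twice and typically slows learning.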

Step 5: Train the Model

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1} completed")
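The loop above only reports that an epoch finished. One possible variation, sketched here on synthetic data so it runs without MNIST, also tracks the mean training loss per epoch. The random tensors and the `nn.Sequential` model are assumptions for illustration with the same layer sizes as the tutorial's network.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical synthetic data in place of MNIST
images = torch.rand(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

model = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

epoch_losses = []
for epoch in range(3):
    model.train()
    running_loss = 0.0
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_images), batch_labels)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the epoch mean is exact
        running_loss += loss.item() * batch_images.size(0)
    epoch_losses.append(running_loss / len(loader.dataset))
    print(f"Epoch {epoch + 1}: mean loss {epoch_losses[-1]:.4f}")
```

On real data, a loss that stops decreasing or starts rising is an early sign to revisit the learning rate or the model.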

Step 6: Evaluate on Test Data

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        predicted = outputs.argmax(dim=1)  # Index of the highest logit is the predicted digit
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print("Test accuracy:", correct / total)

Comparison: TensorFlow (Keras) vs PyTorch

Feature	TensorFlow (Keras)	PyTorch
API Style	Declarative and high-level	Imperative and more flexible
Learning Curve	Easier for beginners	More granular control for advanced users
Dynamic Graph	Eager execution by default in TF 2.x; graph compilation via tf.function	Dynamic by default
Community Usage	Widely used in industry and production	Widely used in academia and research
Deployment Options	TensorFlow Lite, TensorFlow.js, TF Serving	TorchScript, ONNX
Ecosystem	Rich ecosystem including TFX, Keras, etc.	Integrated well with Pythonic tooling

Suggested Enhancements

Here are some ideas to improve your neural network beyond this basic version:

Add Dropout Layers to prevent overfitting:
- layers.Dropout(0.3) in Keras or nn.Dropout(0.3) in PyTorch
Use Convolutional Neural Networks (CNNs):
- Especially effective for image data like MNIST
Track Metrics with TensorBoard or WandB:
- Use tf.keras.callbacks.TensorBoard in TensorFlow
- Use wandb or the built-in torch.utils.tensorboard.SummaryWriter for PyTorch
Enable GPU Acceleration:
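- TensorFlow: uses an available GPU automatically; confirm with tf.config.list_physical_devices('GPU')
- PyTorch: model.to('cuda'), then move each batch with images.to('cuda') and labels.to('cuda')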
Save and Load Models:
- TensorFlow: model.save('model.keras'), tf.keras.models.load_model('model.keras') (recent Keras versions require a .keras or .h5 file extension)
- PyTorch: torch.save(model.state_dict(), 'model.pth'), model.load_state_dict(torch.load('model.pth'))
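Several of these ideas can be combined in a few lines. The PyTorch sketch below adds a dropout layer, picks a GPU when one is available, and round-trips the weights through a temporary file. `build_model` is a hypothetical helper, and the random `sample` tensor stands in for real images.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def build_model() -> nn.Module:
    """Same layer sizes as the tutorial network, plus dropout after the hidden layer."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128),
        nn.ReLU(),
        nn.Dropout(0.3),      # randomly zeroes 30% of activations during training
        nn.Linear(128, 10),
    )

model = build_model().to(device)
model.eval()                  # eval mode disables dropout, so outputs are deterministic

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")
    torch.save(model.state_dict(), path)

    # Rebuild the architecture, then load the saved weights into it
    restored = build_model().to(device)
    restored.load_state_dict(torch.load(path, map_location=device))
restored.eval()

sample = torch.rand(2, 1, 28, 28, device=device)
with torch.no_grad():
    outputs_match = torch.allclose(model(sample), restored(sample))

print(outputs_match)  # True
```

Saving the `state_dict` stores only the weights, so the loading code must recreate the same architecture first. Remember to call `model.train()` again before resuming training so dropout is re-enabled.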

Final Thoughts

Both TensorFlow and PyTorch provide a comprehensive suite of tools for building, training, and deploying deep learning models. This introductory project using MNIST helps you understand how these frameworks work in practice and lays the groundwork for more advanced tasks.

As you continue your deep learning journey, experiment with more complex datasets, architectures like CNNs or RNNs, and real-world applications such as object detection or sentiment analysis.

Shreyash Mhashilkar

I’m Shreyash Mhashilkar, an IT professional who loves building user-friendly, scalable digital solutions. Outside of coding, I enjoy researching new places, learning about different cultures, and exploring how technology shapes the way we live and travel. I share my experiences and discoveries to help others explore new places, cultures, and ideas with curiosity and enthusiasm.