<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Cole's Blog]]></title><description><![CDATA[That's the problem, last time my legs weren't flattened]]></description><link>https://blog.good-spiders.com/</link><image><url>https://blog.good-spiders.com/favicon.png</url><title>Cole&apos;s Blog</title><link>https://blog.good-spiders.com/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Mon, 04 May 2026 07:26:19 GMT</lastBuildDate><atom:link href="https://blog.good-spiders.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Computer Vision / Streamlit / Memory Management]]></title><description><![CDATA[<p>While working on this project I learned  a lot about the different models used for Computer Vision. I also got a grip on Streamlit which is a pure python front end, I&apos;ve already really started to like Streamlit for it&apos;s simplicity, although I won&apos;t</p>]]></description><link>https://blog.good-spiders.com/computer-vision-showdown/</link><guid isPermaLink="false">666bafd5fe6cdf0001202dbf</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 21 Jun 2024 02:28:52 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/cheetah.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/cheetah.png" alt="Computer Vision / Streamlit / Memory Management"><p>While working on this project I learned  a lot about the different models used for Computer Vision. 
I also got a grip on Streamlit, a pure-Python front end. I&apos;ve really come to like Streamlit for its simplicity, although I won&apos;t be covering it in this post.</p><p>Along the way I ran into some issues with RAM usage and learned some new ways to prevent excessive memory consumption and to release RAM properly after using models. One issue I was not able to overcome was with larger LLMs: I simply did not have enough RAM to load them into memory. I ended up opting for OpenAI&apos;s API for the generative-text portion of the project. GPT-2 did load, but its performance was abysmal.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/cskujawa/example-app-cv-model-docker-compose/tree/main"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - cskujawa/example-app-cv-model-docker-compose: Computer Vision app</div><div class="kg-bookmark-description">Computer Vision app. Contribute to cskujawa/example-app-cv-model-docker-compose development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="Computer Vision / Streamlit / Memory Management"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">cskujawa</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/45d2a54a99a64cc3772dad757ff3609c9c1f5a684cbe5eab4794dae21a13ffe8/cskujawa/example-app-cv-model-docker-compose" alt="Computer Vision / Streamlit / Memory Management"></div></a></figure><h2 id="ram-usage">RAM Usage</h2><p>I only figured out two ways to free RAM: either delete the variable or free up the Torch resource (which I was not using in this project).</p><pre><code>    # Clear memory
    del model
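    # Another option worth trying: force a garbage-collection pass after
    # the del so Python can reclaim the freed memory sooner
    import gc
    gc.collect()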
    #torch.cuda.empty_cache() if torch.cuda.is_available() else None</code></pre><p>Now, my server isn&apos;t exactly light on RAM, it hosts a lot of different applications and services, but it has a solid 32GB of EEC RAM. It&apos;s not a huge quantity, but it has served me well up to this point. When it comes to LLMs though, it&apos;s not enough. Below is a screenshot of Netdata with the usage stats at idle.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_At_Rest.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1914" height="938" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_At_Rest.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_At_Rest.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_At_Rest.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_At_Rest.png 1914w" sizes="(min-width: 1200px) 1200px"></figure><p>Below is a screenshot of the same usage stats after starting to load an LLM via transformers.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_Loading_LLM.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1915" height="944" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_Loading_LLM.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_Loading_LLM.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_Loading_LLM.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_Loading_LLM.png 1915w" sizes="(min-width: 1200px) 1200px"></figure><p>Lastly, the usage stats just before the server crashed. 
Nearly 100% RAM usage and 90% swap usage: not good.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_Cutoff_LLM-1.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1917" height="956" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_Cutoff_LLM-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_Cutoff_LLM-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_Cutoff_LLM-1.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_Cutoff_LLM-1.png 1917w" sizes="(min-width: 1200px) 1200px"></figure><p>Ultimately I got it to work by using the OpenAI API for text generation instead of a locally hosted LLM. That&apos;s okay; I&apos;ll get it sorted out in the future. Text generation aside, I got the rest of the models loaded in and working. It was easy, too, since all the models I was testing were in the TensorFlow Keras Applications package.</p><pre><code class="language-Python">
from tensorflow.keras.applications.efficientnet import EfficientNetB0, preprocess_input as efficientnet_preprocess_input, decode_predictions as efficientnet_decode_predictions
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input as resnet_preprocess_input, decode_predictions as resnet_decode_predictions
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input as inception_preprocess_input, decode_predictions as inception_decode_predictions
from tensorflow.keras.applications.mobilenet import MobileNet, preprocess_input as mobilenet_preprocess_input, decode_predictions as mobilenet_decode_predictions
from tensorflow.keras.applications.densenet import DenseNet121, preprocess_input as densenet_preprocess_input, decode_predictions as densenet_decode_predictions
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input as vgg16_preprocess_input, decode_predictions as vgg16_decode_predictions
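
# Sketch of one way to pool every model's top guess into a single table
# sorted by confidence, as the post describes below; rows here are
# illustrative (model_name, label, probability) triples
def rank_predictions(rows):
    return sorted(rows, key=lambda row: row[2], reverse=True)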
</code></pre><p>Getting them all loaded in was fine, and using Streamlit to pull together a UI made it easy. To actually glean some insight from the application, though, I made some changes. First, I placed the results from every model in the same table, then sorted the table by probability (the confidence), so whichever model was most confident appeared at the top with its guess. The models aren&apos;t perfect, but at some tasks they excelled.</p><figure class="kg-card kg-image-card kg-width-full"><img src="https://blog.good-spiders.com/content/images/2024/06/image-1.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="2000" height="958" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/image-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/image-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/image-1.png 1600w, https://blog.good-spiders.com/content/images/2024/06/image-1.png 2000w"></figure><p>With a little bit of tweaking and some thresholds, I got it to write a poem about the image it saw, provided at least a few of the models reported more than 50% probability in their guesses.</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Understanding Convolutional Neural Networks]]></title><description><![CDATA[<p>Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. 
In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various</p>]]></description><link>https://blog.good-spiders.com/breaking-down-ai-understanding-convolutional-neural-networks-origins-mechanics-and-applications/</link><guid isPermaLink="false">665dedb951dfd20001358c00</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Mon, 03 Jun 2024 16:29:38 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" alt="Breaking Down AI - Understanding Convolutional Neural Networks"><p>Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various applications.</p><h4 id="origins-of-cnns">Origins of CNNs</h4><p>The concept of Convolutional Neural Networks traces back to the 1980s when <a href="https://en.wikipedia.org/wiki/Kunihiko_Fukushima" rel="noreferrer">Kunihiko Fukushima</a> introduced the &quot;<a href="https://link.springer.com/article/10.1007/BF00344251" rel="noreferrer">Neocognitron</a>,&quot; a model designed for pattern recognition. </p><p>The Neocognitron was inspired by the model proposed by&#xA0;<a href="https://en.wikipedia.org/wiki/David_H._Hubel">Hubel</a>&#xA0;&amp;&#xA0;<a href="https://en.wikipedia.org/wiki/Torsten_Wiesel">Wiesel</a>&#xA0;in 1959. 
They found two types of cells in the primary visual cortex, called&#xA0;<a href="https://en.wikipedia.org/wiki/Simple_cell"><em>simple cells</em></a>&#xA0;and&#xA0;<a href="https://en.wikipedia.org/wiki/Complex_cell"><em>complex cells</em></a>, and proposed a cascading model of these two cell types for use in pattern recognition tasks.</p><p>However, it wasn&apos;t until the 1990s that CNNs gained significant traction, largely due to the pioneering work of <a href="https://en.wikipedia.org/wiki/Yann_LeCun" rel="noreferrer">Yann LeCun</a> and his colleagues. They developed the <a href="https://medium.com/@siddheshb008/lenet-5-architecture-explained-3b559cb2d52b" rel="noreferrer">LeNet architecture</a>, which was successfully used for digit recognition in zip codes, showcasing the practical potential of CNNs.</p><h4 id="how-cnns-work">How CNNs Work</h4><p>CNNs are specially designed for processing grid-like data, such as images. Here&apos;s a breakdown of their key components and how they function:</p><ol><li><strong>Convolutional Layers:</strong><br>Convolutional layers apply convolution operations to the input data. A convolution operation involves a filter (or kernel) that slides over the input data, performing element-wise multiplication and summation to produce a feature map. These filters are learnable parameters, enabling the network to identify essential features such as edges, textures, and patterns.</li><li><strong>Pooling Layers:</strong><br>Pooling layers reduce the spatial dimensions of the feature maps, making computations more efficient and reducing the risk of overfitting. 
Common pooling operations include max pooling (selecting the maximum value within a window) and average pooling (computing the average value within a window).</li><li><strong>Activation Functions:</strong><br>Non-linear activation functions, such as ReLU (Rectified Linear Unit), introduce non-linearity into the model, allowing it to learn more complex patterns.</li><li><strong>Fully Connected Layers:</strong><br>After several convolutional and pooling layers, the feature maps are flattened and fed into fully connected (dense) layers. These layers function like traditional neural networks, where each neuron is connected to every neuron in the previous layer.</li><li><strong>Output Layer:</strong><br>The final layer produces the output, which can be a classification (e.g., softmax for multi-class classification) or regression (e.g., linear activation for continuous values).</li></ol><h4 id="why-cnns-are-used">Why CNNs Are Used</h4><p>CNNs are popular for several reasons:</p><ol><li><strong>Spatial Hierarchy of Features:</strong><br>CNNs effectively capture spatial hierarchies in images, from low-level features (like edges) to high-level features (like objects). This hierarchical learning enables robust feature extraction.</li><li><strong>Parameter Sharing:</strong><br>By using the same filter across different parts of the input, CNNs significantly reduce the number of parameters compared to fully connected networks, making them more efficient and less prone to overfitting.</li><li><strong>Translation Invariance:</strong><br>CNNs are inherently translation-invariant, meaning they can recognize objects regardless of their position in the image. 
This property is crucial for tasks like object detection and image classification.</li><li><strong>Versatility:</strong><br>While originally designed for image data, CNNs have been successfully applied to various types of data, including time-series data, audio signals, and even text (in the form of character-level or word-level embeddings).</li><li><strong>Performance:</strong><br>CNNs have achieved state-of-the-art performance in many computer vision tasks, including image classification (e.g., AlexNet, VGG, ResNet), object detection (e.g., YOLO, Faster R-CNN), and image segmentation (e.g., U-Net).</li></ol><h4 id="applications-of-cnns">Applications of CNNs</h4><p>CNNs are used in a wide range of applications:</p><ul><li><strong>Image Classification:</strong> Identifying objects within images (e.g., ImageNet competition).</li><li><strong>Object Detection:</strong> Locating and classifying objects within an image (e.g., self-driving cars).</li><li><strong>Image Segmentation:</strong> Dividing an image into segments for detailed analysis (e.g., medical imaging).</li><li><strong>Face Recognition:</strong> Identifying and verifying individuals&apos; faces (e.g., security systems).</li><li><strong>Natural Language Processing:</strong> Applying CNNs to text data for tasks like sentiment analysis and language modeling.</li></ul><h3 id="conclusion">Conclusion</h3><p>Convolutional Neural Networks are a powerful and versatile tool in deep learning, particularly suited for tasks involving visual data and other structured grid-like data. Their ability to automatically and adaptively learn spatial hierarchies of features makes them invaluable for a wide range of applications. 
As the field of deep learning continues to evolve, CNNs will undoubtedly remain at the forefront of technological advancements.</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Exploring Image Classification with Neural Networks]]></title><description><![CDATA[<p>I started this project based on a recommendation from a friend. I didn&apos;t know at the time that what I was embarking on was a Convolutional Neural Network journey. That is what it ended up being, though. To further explore the concept I wrote a companion</p>]]></description><link>https://blog.good-spiders.com/exploring-image-classification-with-neural-networks/</link><guid isPermaLink="false">665de6e751dfd20001358bb8</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Mon, 03 Jun 2024 16:21:19 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/empty_07.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/empty_07.jpg" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"><p>I started this project based on a recommendation from a friend. I didn&apos;t know at the time that what I was embarking on was a Convolutional Neural Network journey. 
That is what it ended up being, though. To further explore the concept, I wrote a companion blog post about CNNs to provide a bit more context and to learn more about them myself.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://blog.good-spiders.com/breaking-down-ai-understanding-convolutional-neural-networks-origins-mechanics-and-applications/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Breaking Down AI - Understanding Convolutional Neural Networks</div><div class="kg-bookmark-description">Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://blog.good-spiders.com/content/images/size/w256h256/2024/05/good-spiders-logo.png" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"><span class="kg-bookmark-author">Good Spiders Blog</span><span class="kg-bookmark-publisher">Cole Kujawa</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"></div></a></figure><p>Full Jupyter Notebook for this post is here:<br><a href="https://github.com/cskujawa/jupyter-notebooks/blob/main/ML/FullBowl/FullBowl.ipynb">https://github.com/cskujawa/jupyter-notebooks/blob/main/ML/FullBowl/FullBowl.ipynb</a></p><h3 id="project-overview">Project Overview</h3><p>The goal of this project was to build a model capable of classifying images of bowls as either full or empty. The project was implemented using Python in a Jupyter Notebook, leveraging the power of deep learning frameworks like Torch. 
Here&apos;s a step-by-step guide to the process, including detailed notes from the notebook to provide additional context and insights.</p><h3 id="step-1-setting-up-the-environment">Step 1: Setting Up the Environment</h3><p>Before diving into the code, it was essential to set up the environment. This included installing the necessary libraries and frameworks.</p><ul><li><code>torch</code>: Core library.</li><li><code>torch.utils.data.DataLoader</code>: Utility to load data in batches.</li><li><code>torchvision.datasets</code>: Contains many standard vision datasets.</li><li><code>torchvision.transforms</code>: Common image transformations.</li><li><code>datasets.load_dataset</code>: Function to load datasets from Hugging Face.</li><li><code>ViTFeatureExtractor</code>: A feature extractor for ViT models.</li></ul><pre><code class="language-Python">import numpy as np
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from datasets import load_dataset
from transformers import ViTFeatureExtractor</code></pre><h3 id="step-2-preparing-the-data">Step 2: Preparing the Data</h3><p>The data preparation phase involved loading and preprocessing the images. This step is crucial as it ensures the data is in the right format for training the model. The images were resized, normalized, and augmented to improve the model&apos;s performance and generalization.</p><h4 id="loading-and-preprocessing-images">Loading and Preprocessing Images</h4><pre><code class="language-Python"># Load dataset
dataset = load_dataset(&apos;training&apos;, data_dir=&apos;data&apos;)

# The ViT feature extractor applied inside the transform below
from transformers import ViTFeatureExtractor
feature_extractor = ViTFeatureExtractor.from_pretrained(&apos;google/vit-base-patch16-224&apos;)

# Split the dataset into training and validation sets
train_test_split = dataset[&apos;train&apos;].train_test_split(test_size=0.2)
train_dataset = train_test_split[&apos;train&apos;]
eval_dataset = train_test_split[&apos;test&apos;]

# Transform function
def transform(example_batch):
    # Convert images to RGB and apply feature extraction
    images = [image.convert(&quot;RGB&quot;) for image in example_batch[&apos;image&apos;]]
    inputs = feature_extractor(images, return_tensors=&apos;pt&apos;)
    inputs[&apos;labels&apos;] = example_batch[&apos;label&apos;]
    return inputs

# Apply transformations to datasets
train_dataset.set_transform(transform)
eval_dataset.set_transform(transform)</code></pre><h3 id="step-3-load-the-vit-model">Step 3: Load the ViT model</h3><p>This phase is where we load the pre-trained Vision Transformer (ViT) model and its corresponding feature extractor from Hugging Face. It prepares the model for image classification tasks with the specified number of output labels.</p><p>Libraries:</p><ul><li><code>ViTForImageClassification</code>: Model class for Vision Transformer.</li><li><code>from_pretrained</code>: Loads a pre-trained model from Hugging Face&apos;s model hub.</li><li><code>num_labels</code>: Number of output labels.</li><li><code>ignore_mismatched_sizes</code>: Useful for fine-tuning a model with different input sizes.</li></ul><pre><code class="language-Python">from torch.utils.data import DataLoader
from transformers import ViTForImageClassification, TrainingArguments

model = ViTForImageClassification.from_pretrained(&apos;google/vit-base-patch16-224&apos;, num_labels=2, ignore_mismatched_sizes=True)

# DataLoaders consumed by the training and evaluation loops below
# (batch size matches the TrainingArguments values)
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
eval_loader = DataLoader(eval_dataset, batch_size=8)

training_args = TrainingArguments(
    output_dir=&apos;./results&apos;,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    evaluation_strategy=&quot;epoch&quot;,
    save_strategy=&quot;epoch&quot;,
    logging_dir=&apos;./logs&apos;,
)</code></pre><h3 id="step-4-training-the-model">Step 4: Training the Model</h3><p>Training the model involved feeding it the prepared images and letting it learn the distinguishing features of full and empty bowls. This was done over multiple epochs, with the model&apos;s performance being evaluated on a validation set.</p><h3 id="set-up-optimizer-and-learning-rate-scheduler">Set up optimizer and learning rate scheduler</h3><p>This block sets up the optimizer and learning rate scheduler for training the model. The optimizer updates the model parameters to minimize the loss function, while the learning rate scheduler adjusts the learning rate during training to improve performance.</p><p>Libraries:</p><ul><li><code>AdamW</code>: Optimizer with weight decay fix, recommended for transformers.</li><li><code>get_scheduler</code>: Utility to get a learning rate scheduler.</li><li><code>tqdm.auto.tqdm</code>: Progress bar library.</li></ul><pre><code class="language-Python">from transformers import AdamW, get_scheduler
from tqdm.auto import tqdm

optimizer = AdamW(model.parameters(), lr=5e-5)

num_epochs = 3
num_training_steps = num_epochs * len(train_loader)
lr_scheduler = get_scheduler(
    name=&quot;linear&quot;, optimizer=optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
)
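
# For intuition: with zero warmup steps, the "linear" schedule scales the
# learning rate by a factor that decays from 1 to 0 over training, roughly:
def linear_lr_factor(step, num_training_steps):
    return max(0.0, (num_training_steps - step) / num_training_steps)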

progress_bar = tqdm(range(num_training_steps))

model.train()
for epoch in range(num_epochs):
    for batch in train_loader:
        # Move batch to device (GPU or CPU)
        batch = {k: v.to(model.device) for k, v in batch.items()}

        # Forward pass
        # print(f&quot;batch: {batch}&quot;)  # uncomment to inspect a raw batch
        outputs = model(**batch)
        loss = outputs.loss

        # Backward pass
        loss.backward()

        # Update weights
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        progress_bar.update(1)

        # Print loss for debugging
        progress_bar.set_postfix({&quot;loss&quot;: loss.item()})</code></pre><h3 id="step-5-evaluating-the-model">Step 5: Evaluating the Model</h3><p>After training, ideally the model&apos;s performance would be evaluated on a separate test set, but I was tired of sourcing images for the dataset so I used the existing images. This step would usually provide insights into how well the model generalized to new, unseen data.</p><p>Libraries:</p><ul><li><code>model.eval()</code>: Sets the model to evaluation mode.</li><li><code>torch.no_grad()</code>: Disables gradient calculation for faster evaluation.</li><li><code>batch.items()</code>: Iterates over the items in a batch.</li></ul><pre><code class="language-Python">model.eval()
total_correct = 0
total_samples = 0

for batch in eval_loader:
    with torch.no_grad():
        # Move batch to device (GPU or CPU)
        batch = {k: v.to(model.device) for k, v in batch.items()}

        # Forward pass
        outputs = model(**batch)

        # Access logits
        logits = outputs.logits

        # Get predictions
        predictions = torch.argmax(logits, dim=-1)

        # Get true labels (assuming they are in &apos;labels&apos; key of batch)
        labels = batch[&apos;labels&apos;]

        # Count correct predictions
        correct = (predictions == labels).sum().item()

        # Update totals
        total_correct += correct
        total_samples += labels.size(0)

# Calculate accuracy
accuracy = (total_correct / total_samples) * 100
print(f&quot;Accuracy: {accuracy:.2f}%&quot;)</code></pre><h3 id="step-6-save-the-model-and-feature-extractor">Step 6: Save the model and feature extractor</h3><p>After training and evaluating, I saved the trained model and the feature extractor to disk so that they can be loaded and used later without retraining.</p><p>Libraries:</p><ul><li><code>save_pretrained</code>: Saves the model and feature extractor for later use.</li></ul><pre><code class="language-Python">model.save_pretrained(&apos;models/cat_bowl_model&apos;)
feature_extractor.save_pretrained(&apos;models/cat_bowl_model&apos;)</code></pre><h3 id="step-7-making-predictions">Step 7: Making Predictions</h3><p>Finally, the model was used to make predictions on new images, if only a few (images 6 and 7 for the full and empty bowls). This section demonstrates how to load the saved model and feature extractor, preprocess a new image, and make a prediction using the model. It also prints the predicted label for the given image.</p><p>Libraries:</p><ul><li><code>PIL.Image</code>: Python Imaging Library, used to open and manipulate images.</li><li><code>ViTImageProcessor</code>: Processes images for the ViT model.</li><li><code>model(**inputs).logits</code>: Forward pass to get logits (raw model outputs).</li><li><code>logits.argmax(-1)</code>: Gets the index of the highest logit, representing the predicted class.</li></ul><pre><code class="language-Python">from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch  # torch.no_grad() is used in predict() below

# Load the fine-tuned model and image processor
model = ViTForImageClassification.from_pretrained(&apos;models/cat_bowl_model&apos;)
processor = ViTImageProcessor.from_pretrained(&apos;models/cat_bowl_model&apos;)

# Function to load and preprocess the image
def load_and_preprocess_image(image_path):
    image = Image.open(image_path)
    inputs = processor(images=image, return_tensors=&quot;pt&quot;)
    return inputs

# Function to predict if the bowl is full or empty
def predict(image_path):
    inputs = load_and_preprocess_image(image_path)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}  # Move to device
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    return predicted_class_idx

# Class labels
class_labels = [&quot;empty&quot;, &quot;full&quot;]
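
# Quick sanity check of how an argmax over the two logits maps onto these
# labels (pure-Python illustration with made-up logit values)
def label_from_logits(logits, labels=("empty", "full")):
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]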

# Main function for CLI
def main():
    # Classify each of the test images (seven empty, seven full)
    image_paths = [f&quot;training/test/empty_{i:02d}.jpg&quot; for i in range(1, 8)]
    image_paths += [f&quot;training/test/full_{i:02d}.jpg&quot; for i in range(1, 8)]
    for image_path in image_paths:
        predicted_class_idx = predict(image_path)
        print(f&quot;The bowl in {image_path} is {class_labels[predicted_class_idx]}.&quot;)

if __name__ == &quot;__main__&quot;:
main()</code></pre><h2 id="conclusion">Conclusion</h2><p>And finally, the results. Different training runs have produced varying outcomes, and there is always a degree of inaccuracy with this project, which I believe is due largely to the limited dataset used for training.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/06/image.png" class="kg-image" alt="Breaking Down AI - Exploring Image Classification with Neural Networks" loading="lazy" width="543" height="371"></figure><p>There are many ways this project could be extended to further improve its accuracy.</p><ol><li><strong>Data Augmentation</strong>: Although some augmentation was applied, further augmenting the dataset with more varied transformations (such as rotations, flips, and color adjustments) could help the model generalize better.</li><li><strong>Increasing Dataset Size</strong>: Collecting more images for training could significantly enhance the model&apos;s ability to learn diverse features, leading to better performance.</li><li><strong>Hyperparameter Tuning</strong>: Experimenting with different hyperparameters, such as learning rates, batch sizes, and optimizer choices, could yield better training outcomes.</li><li><strong>Advanced Architectures</strong>: Implementing more advanced architectures like ResNet, VGG, or EfficientNet, which are designed to handle image classification tasks more effectively, could improve accuracy.</li></ol>]]></content:encoded></item><item><title><![CDATA[Getting Started with Pretrained Models from Hugging Face in Jupyter Lab]]></title><description><![CDATA[<p>Today we&apos;re going to be getting started with pretrained models from Hugging Face. If you&apos;re new to the world of machine learning or just looking to explore how to use powerful pretrained models, you&apos;re in the right place. 
We&apos;ll be running our</p>]]></description><link>https://blog.good-spiders.com/getting-started-with-huggingface/</link><guid isPermaLink="false">665a0ee3bc67790001e144d3</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 20:01:45 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/hf-logo-with-title.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/hf-logo-with-title.png" alt="Getting Started with Pretrained Models from Hugging Face in Jupyter Lab"><p>Today we&apos;re going to be getting started with pretrained models from Hugging Face. If you&apos;re new to the world of machine learning or just looking to explore how to use powerful pretrained models, you&apos;re in the right place. We&apos;ll be running our project in Jupyter Lab, so make sure you have it installed and ready to go, if not you can check out my post on Jupyter Lab <a href="https://blog.good-spiders.com/running-jupyter-labs/" rel="noreferrer">here</a>. Let&apos;s dive in!</p><h2 id="what-is-hugging-face">What is Hugging Face?</h2><p>Hugging Face is a company that has made a significant impact in the field of natural language processing (NLP). They offer a library called <code>transformers</code>, which provides access to a wide variety of pretrained models for tasks like text classification, question answering, translation, and much more. These models are trained on massive datasets and can save you a ton of time and resources.</p><h2 id="installing-the-transformers-library-and-dependencies">Installing the Transformers Library and Dependencies</h2><p>Before we can use Hugging Face&apos;s models, we need to install the <code>transformers</code> library and any necessary dependencies. Open a new terminal in Jupyter Lab and run:</p><pre><code class="language-Python"># Install transformers dependencies
!pip install -qU transformers
!pip install accelerate
# Reinstall TensorFlow with the [and-cuda] extra for GPU support (the base package should already be present in this container)
!pip install tensorflow[and-cuda]
import transformers</code></pre><p>This will install the library and its dependencies, allowing us to access and use the pretrained models.</p><h2 id="signing-in-to-hugging-face">Signing in to Hugging Face</h2><p>Before we can access Hugging Face&apos;s models we&apos;ll need to login. You&apos;ll need your authentication information from your Hugging Face account, I use the authentication token. You can find yours here:<br><a href="https://huggingface.co/settings/tokens">https://huggingface.co/settings/tokens</a></p><pre><code class="language-Python"># Install Huggingface dependency and login using token
!pip install --upgrade huggingface_hub
from huggingface_hub import login
login()</code></pre><h2 id="loading-a-pretrained-model">Loading a Pretrained Model</h2><p>Now, let&apos;s load a pretrained model. For this example, we&apos;ll use a sentiment analysis model to classify text as positive or negative. We&apos;ll use the <code>pipeline</code> API, which makes it easy to use pretrained models for various tasks.</p><p>First, let&apos;s import the necessary libraries and set up our pipeline:</p><pre><code class="language-Python">from transformers import pipeline

# Load a sentiment analysis pipeline
classifier = pipeline(&apos;sentiment-analysis&apos;)</code></pre><p>The <code>pipeline</code> function abstracts away the complexities of loading models and tokenizers. By specifying <code>sentiment-analysis</code>, we&apos;re loading a pipeline tailored for this specific task.</p><h2 id="performing-sentiment-analysis">Performing Sentiment Analysis</h2><p>With our pipeline set up, we can now perform sentiment analysis on some sample text. Let&apos;s see how it works:</p><pre><code class="language-Python"># Define some text to analyze
text = &quot;I absolutely love the new features in the latest update! It&apos;s amazing!&quot;

# Use the classifier to analyze the sentiment
result = classifier(text)
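# Optional sketch, not in the original post: a transformers pipeline also
# accepts a list of strings and returns one prediction per input.

```python
batch = ["The service was great!", "I want a refund."]
# batch_results = classifier(batch)  # one {'label': ..., 'score': ...} dict per string
```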

# Print the result
print(result)</code></pre><p>When you run this code, you&apos;ll see the model&apos;s prediction. In this case, it will likely classify the text as positive. The output will look something like this:</p><pre><code class="language-Python">[{&apos;label&apos;: &apos;POSITIVE&apos;, &apos;score&apos;: 0.9998745918273926}]</code></pre><p>The <code>label</code> indicates the predicted sentiment, and the <code>score</code> represents the confidence level of the prediction.</p><h2 id="exploring-more-use-cases">Exploring More Use Cases</h2><p>The <code>transformers</code> library supports a wide range of tasks beyond sentiment analysis. Here are a few examples:</p><p><strong>Text Generation</strong>:</p><pre><code class="language-Python">generator = pipeline(&apos;text-generation&apos;, model=&apos;gpt2&apos;)
result = generator(&quot;Once upon a time&quot;, max_length=50, num_return_sequences=1)
print(result)</code></pre><p><strong>Question Answering</strong>:</p><pre><code class="language-Python">question_answerer = pipeline(&apos;question-answering&apos;)
context = &quot;Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very close to the Manhattan Bridge.&quot;
question = &quot;Where is Hugging Face based?&quot;
result = question_answerer(question=question, context=context)
print(result)</code></pre><p><strong>Translation</strong>:</p><pre><code class="language-Python">translator = pipeline(&apos;translation_en_to_fr&apos;)
result = translator(&quot;Hello, how are you?&quot;)
print(result)</code></pre><h2 id="conclusion">Conclusion</h2><p>And there you have it! We&apos;ve successfully used a pretrained model from Hugging Face to perform sentiment analysis in Jupyter Lab. Hugging Face&apos;s <code>transformers</code> library makes it incredibly easy to access and use powerful models for a variety of NLP tasks. Whether you&apos;re analyzing sentiment, generating text, answering questions, or translating languages, the possibilities are endless.</p><p>I hope this guide helps you get started with pretrained models from Hugging Face. Happy coding, and feel free to share your experiences and any cool projects you create using these amazing tools!</p><p>Happy coding!</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Rainfall Prediction Using Machine Learning]]></title><description><![CDATA[<p>AI is all around us now, it&apos;s moving so fast that at times it feels as if I&apos;ll never keep up. I&apos;ll try though. One common trend I&apos;ve picked up on at work and in conversations with friends and family is that</p>]]></description><link>https://blog.good-spiders.com/rainfall-prediction-using-machine-learning/</link><guid isPermaLink="false">6659f226bc67790001e1447f</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 16:30:23 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/steve-johnson-ZPOoDQc8yMw-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/steve-johnson-ZPOoDQc8yMw-unsplash.jpg" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning"><p>AI is all around us now, it&apos;s moving so fast that at times it feels as if I&apos;ll never keep up. I&apos;ll try though. One common trend I&apos;ve picked up on at work and in conversations with friends and family is that AI has become such a broad and mystical term it&apos;s almost meaningless. 
It could mean anything, but in reality it&apos;s been around us for decades. What is really new are truly advanced chat assistants like ChatGPT, Gemini, and Claude. For my friends, family, and colleagues I wanted to start a series on breaking down AI into its constituent parts: the process of preparing data, selecting algorithms, training, evaluating, and many of the other steps that go into making these various AI services. So I&apos;m starting this series where I will take a look at a single use case involving data science with a singular goal in mind. These will be like micro projects which may or may not work as intended, but will hopefully shed some light on the different ways AI is implemented.</p><p>Trying to explain every facet of the different processes involved in data science will be out of scope for this post, but I will try to explain what I understand and what I can as I go along.</p><p>For each blog post in this series there will be a corresponding Jupyter Labs notebook available; the notebook and dataset used for this post are here:<br><a href="https://github.com/cskujawa/jupyter-notebooks/tree/main/ML/RainfallPredictions">https://github.com/cskujawa/jupyter-notebooks/tree/main/ML/RainfallPredictions</a></p><p>Before diving into it, I want to talk a little about how AI is involved in predicting weather. Organizations like the National Oceanic and Atmospheric Administration (NOAA) utilize sophisticated technologies and methodologies to make weather predictions, some of which include machine learning like we will use in this blog post.</p><blockquote><strong>Neural Networks and Regression Models:</strong> Enhance specific aspects of forecasting, like precipitation or storm intensity predictions. -<a href="https://www.noaa.gov/ai/about">https://www.noaa.gov/ai/about</a></blockquote><p>So, today I&#x2019;m excited to share a project where I predicted daily rainfall (albeit inaccurately) using historical data and machine learning. 
This journey takes us through data preparation, visualization, model selection, training, and evaluation.</p><h2 id="preparing-the-data">Preparing the Data</h2><p>First, I loaded the dataset and began the process of cleaning it up. This involved checking for missing values, outliers, and any inconsistencies that might skew the results.</p><pre><code class="language-Python"># Loading the Dataset
import pandas as pd

# Assuming the dataset is in a CSV file named &apos;rainfall_data.csv&apos;
df = pd.read_csv(&apos;rainfall_data.csv&apos;)

# Display the first few rows of the dataframe
df.head()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-15.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="882" height="301" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-15.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-15.png 882w" sizes="(min-width: 720px) 720px"></figure><h3 id="cleaning-the-data">Cleaning the Data</h3><p>I checked for missing values and filled them using forward fill to ensure there were no gaps that could affect the analysis.</p><pre><code class="language-Python"># Checking for missing values
df.isnull().sum()

# Fill or drop missing values if necessary
df = df.ffill()

# Alternatively, you can use backward fill
# df = df.bfill()</code></pre><h3 id="trimming-unnecessary-columns">Trimming Unnecessary Columns</h3><p>Next, I removed any columns that weren&#x2019;t relevant to the analysis, keeping only what was necessary.</p><pre><code class="language-Python"># Keeping only relevant columns
# Assuming &apos;DATE&apos; and &apos;PRCP&apos; are the relevant columns
df = df[[&apos;DATE&apos;, &apos;PRCP&apos;]]

# Renaming the &apos;PRCP&apos; column to &apos;Rainfall&apos;
df = df.rename(columns={&apos;PRCP&apos;: &apos;Rainfall&apos;})

# Display the first few rows of the cleaned dataframe
df.head()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-16.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="259" height="296"></figure><h3 id="exploring-the-data">Exploring the Data</h3><p>To understand the data better, I explored its structure and distribution.</p><pre><code class="language-Python"># Summary statistics
df.describe()

# Checking the data types
df.info()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-17.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="473" height="266"></figure><p>I had to do some additional cleaning to remove gaps in the timeline where there was no data, and I eventually ended up with this dataset.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-20.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="1645" height="891" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-20.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-20.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-20.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-20.png 1645w" sizes="(min-width: 720px) 720px"></figure><h3 id="engineering-features">Engineering Features</h3><p>I created new features from the existing data to make it more informative for the model.</p><pre><code class="language-Python"># Example: Extracting date-related features
df[&apos;DATE&apos;] = pd.to_datetime(df[&apos;DATE&apos;])
df[&apos;Year&apos;] = df[&apos;DATE&apos;].dt.year
df[&apos;Month&apos;] = df[&apos;DATE&apos;].dt.month
df[&apos;Day&apos;] = df[&apos;DATE&apos;].dt.day</code></pre><h3 id="splitting-the-data">Splitting the Data</h3><p>I split the dataset into training and testing sets to evaluate the model&#x2019;s performance accurately.</p><pre><code class="language-Python">from sklearn.model_selection import train_test_split

# Define features and target variable
X = df[[&apos;Year&apos;, &apos;Month&apos;, &apos;Day&apos;]]
y = df[&apos;Rainfall&apos;]

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)</code></pre><h2 id="choosing-and-training-models">Choosing and Training Models</h2><p>We will select a suitable machine learning model, train it on the training data, and evaluate its performance on the test data.</p><h3 id="linear-regression-model">Linear Regression Model</h3><blockquote><br>A linear regression model is a statistical method used to model the relationship between a dependent variable and one or more independent variables. It assumes a linear relationship between the input variables (independent) and the output variable (dependent). The model attempts to find the best-fitting straight line (the regression line) that minimizes the difference (error) between the observed data points and the predicted values. This line is represented by the equation &#x1D466;=&#x1D6FD;0+&#x1D6FD;1&#x1D465;+&#x1D716;<em>y</em>=<em>&#x3B2;</em>0&#x200B;+<em>&#x3B2;</em>1&#x200B;<em>x</em>+<em>&#x3F5;</em>, where &#x1D466;<em>y</em> is the dependent variable, &#x1D465;<em>x</em> is the independent variable, &#x1D6FD;0<em>&#x3B2;</em>0&#x200B; is the y-intercept, &#x1D6FD;1<em>&#x3B2;</em>1&#x200B; is the slope of the line, and &#x1D716;<em>&#x3F5;</em> is the error term. Linear regression is widely used for prediction and forecasting in various fields such as economics, biology, engineering, and social sciences.</blockquote><p>I started with a simple Linear Regression model to get a baseline performance.</p><pre><code class="language-Python">from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Initialize the model
lr_model = LinearRegression()

# Train the model
lr_model.fit(X_train, y_train)

# Predictions
y_pred_lr = lr_model.predict(X_test)

# Evaluation
mae_lr = mean_absolute_error(y_test, y_pred_lr)
mse_lr = mean_squared_error(y_test, y_pred_lr)</code></pre><h3 id="random-forest-model">Random Forest Model</h3><blockquote>A Random Forest model is an ensemble learning method used for classification and regression tasks. It constructs multiple decision trees during training and merges their results to produce a more accurate and stable prediction. Each tree in the forest is built using a random subset of the data and a random subset of the features, which helps to reduce overfitting and improve generalization. The final prediction is obtained by averaging the predictions of all the trees (for regression) or by majority voting (for classification). Random Forest models are known for their high accuracy, robustness to noise, and ability to handle large datasets with many features. They are widely used in various domains, including finance, healthcare, and image recognition.</blockquote><p>Next, I tried a Random Forest model, known for its ability to handle more complex data patterns.</p><pre><code class="language-Python">from sklearn.ensemble import RandomForestRegressor

# Initialize the model
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)

# Train the model
rf_model.fit(X_train, y_train)

# Predictions
y_pred_rf = rf_model.predict(X_test)

# Evaluation
mae_rf = mean_absolute_error(y_test, y_pred_rf)
mse_rf = mean_squared_error(y_test, y_pred_rf)</code></pre><h2 id="evaluating-the-models">Evaluating the Models</h2><p>I compared the performance of both models using mean absolute error (MAE) and mean squared error (MSE).</p><pre><code class="language-Python"># Model evaluation results
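# Optional sketch, not in the original notebook: RMSE is the square root of
# MSE and is expressed in the same units as the rainfall values, which makes
# the error magnitude easier to interpret than MSE itself.

```python
def rmse(mse):
    """Root mean squared error derived from a mean squared error value."""
    return mse ** 0.5
```

# For example, rmse(mse_rf) converts the Random Forest MSE of 0.0901 to about 0.30.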
evaluation_results = {
    &apos;Model&apos;: [&apos;Linear Regression&apos;, &apos;Random Forest&apos;],
    &apos;MAE&apos;: [mae_lr, mae_rf],
    &apos;MSE&apos;: [mse_lr, mse_rf]
}

# Display the evaluation results
pd.DataFrame(evaluation_results)</code></pre><p>We compared the performance of two machine learning models: Linear Regression and Random Forest. The evaluation results for Mean Squared Error (MSE) were:</p><ul><li><strong>Linear Regression</strong>: 0.1007</li><li><strong>Random Forest</strong>: 0.0901</li></ul><p>The lower MSE value for the Random Forest model indicates that it performs better in predicting daily rainfall compared to the Linear Regression model.</p><h2 id="visualizing-the-results">Visualizing the Results</h2><p>To see how well the models performed, I visualized the actual vs predicted rainfall values.</p><pre><code class="language-Python">import matplotlib.pyplot as plt

# Plot actual vs predicted values for Linear Regression
plt.figure(figsize=(10, 5))
plt.plot(y_test.values, label=&apos;Actual&apos;)
plt.plot(y_pred_lr, label=&apos;Predicted - Linear Regression&apos;)
plt.legend()
plt.title(&apos;Actual vs Predicted Rainfall (Linear Regression)&apos;)
plt.show()

# Plot actual vs predicted values for Random Forest
plt.figure(figsize=(10, 5))
plt.plot(y_test.values, label=&apos;Actual&apos;)
plt.plot(y_pred_rf, label=&apos;Predicted - Random Forest&apos;)
plt.legend()
plt.title(&apos;Actual vs Predicted Rainfall (Random Forest)&apos;)
plt.show()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-21.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="1661" height="891" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-21.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-21.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-21.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-21.png 1661w" sizes="(min-width: 720px) 720px"></figure><h2 id="reflections-and-future-directions">Reflections and Future Directions</h2><h3 id="model-performance">Model Performance</h3><p>The Random Forest model outperformed the Linear Regression model. It captured the trends and patterns in the data more effectively, as seen in the visualizations where predicted values closely followed the actual values.</p><h3 id="key-takeaways">Key Takeaways</h3><ol><li><strong>Model Selection</strong>: The Random Forest model is better suited for predicting daily rainfall in this dataset.</li><li><strong>Strengths</strong>: Its ability to handle non-linear relationships and interactions within the data.</li><li><strong>Areas for Improvement</strong>: Hyperparameter tuning, advanced models, and incorporating additional features like temperature and humidity.</li></ol><h3 id="future-work">Future Work</h3><p>To further improve the model:</p><ol><li><strong>Hyperparameter Tuning</strong>: Fine-tune the Random Forest parameters.</li><li><strong>Feature Engineering</strong>: Explore additional features such as weather conditions and seasonal indicators.</li><li><strong>Advanced Models</strong>: Experiment with LSTM or other time-series forecasting methods.</li><li><strong>Data Quality</strong>: Ensure a comprehensive and continuous dataset, possibly integrating more data sources.</li></ol><h3 
id="final-thoughts">Final Thoughts</h3><p>This project highlighted the importance of choosing the right model and evaluating it thoroughly. The Random Forest model proved to be a reliable tool for predicting daily rainfall, but there is a lot of room for improvement.</p><p>By continuously refining the model and incorporating new data and techniques, we can enhance its accuracy and utility in real-world applications.</p><p>Thanks for following along on this journey. Stay tuned for more explorations into the world of data science and machine learning!</p>]]></content:encoded></item><item><title><![CDATA[Setting Up and Using Jupyter Labs with Docker Compose]]></title><description><![CDATA[<h3 id="introduction">Introduction</h3><p><a href="https://jupyter.org/" rel="noreferrer">Jupyter Lab</a> is an essential tool for data scientists, providing an interactive environment for running notebooks, code, and data visualizations. Setting it up with Docker Compose ensures a consistent, isolated environment, leveraging GPU resources for enhanced performance.</p><p>Note that using the GPU version of Jupyter labs is not required,</p>]]></description><link>https://blog.good-spiders.com/running-jupyter-labs/</link><guid isPermaLink="false">665941f9db237400017b68bc</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 15:40:03 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/Jupyter_logo.svg.png" medium="image"/><content:encoded><![CDATA[<h3 id="introduction">Introduction</h3><img src="https://blog.good-spiders.com/content/images/2024/05/Jupyter_logo.svg.png" alt="Setting Up and Using Jupyter Labs with Docker Compose"><p><a href="https://jupyter.org/" rel="noreferrer">Jupyter Lab</a> is an essential tool for data scientists, providing an interactive environment for running notebooks, code, and data visualizations. 
Setting it up with Docker Compose ensures a consistent, isolated environment, leveraging GPU resources for enhanced performance.</p><p>Note that using the GPU version of Jupyter Lab is not required, but I am using it for running ML programs, and processing on a CPU would be far slower.</p><h3 id="why-jupyter-labs">Why Jupyter Labs?</h3><p><strong>1. Rapid Iteration</strong>:<br>Jupyter Lab enables quick prototyping and experimentation. By providing an interactive coding environment, you can write and test code snippets, visualize data, and see results in real-time, making it ideal for data science workflows.</p><p><strong>2. Interactive Visualizations</strong>:<br>With built-in support for rich media outputs, Jupyter Lab is perfect for data visualization. It supports libraries like Matplotlib, Seaborn, and Plotly, allowing you to create and display charts, graphs, and other visualizations directly in your notebook.</p><p><strong>3. Ease of Use</strong>:<br>The intuitive interface of Jupyter Lab makes it accessible to both beginners and experienced users. It supports multiple programming languages via kernels, though Python is the most commonly used.</p><p><strong>4. Enhanced Collaboration</strong>:<br>Jupyter Lab&apos;s ability to share notebooks and work collaboratively with others is a significant advantage. Notebooks can be exported in various formats, including HTML and PDF, and shared with colleagues for review and collaboration.</p><p>You can find my Jupyter Notebooks here: <a href="https://github.com/cskujawa/jupyter-notebooks/tree/main">https://github.com/cskujawa/jupyter-notebooks/tree/main</a></p><p><strong>5. GPU Acceleration</strong>:<br>Leveraging GPUs in data science tasks can significantly speed up computations, especially for machine learning and deep learning applications. 
The Docker setup ensures that your Jupyter Lab environment is optimized for GPU usage, providing a performance boost for intensive tasks.</p><h3 id="docker-compose-definition">Docker Compose Definition</h3><p>Here&apos;s the Docker Compose configuration I used for setting up my instance of Jupyter Lab:</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-13.png" class="kg-image" alt="Setting Up and Using Jupyter Labs with Docker Compose" loading="lazy" width="935" height="754" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-13.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-13.png 935w" sizes="(min-width: 720px) 720px"></figure><h3 id="explanation">Explanation</h3><ol><li><strong>Container Configuration</strong>:<ul><li><strong>Image</strong>: Uses <code>cschranz/gpu-jupyter:v1.6_cuda-12.0_ubuntu-22.04</code>, which is tailored for GPU support with CUDA 12.0 and Ubuntu 22.04.</li></ul></li><li><strong>Environment Variables</strong>:<ul><li><strong>JUPYTER_ENABLE_LAB</strong>: Enables the Jupyter Lab interface.</li><li><strong>NVIDIA_DRIVER_CAPABILITIES</strong> and <strong>NVIDIA_VISIBLE_DEVICES</strong>: Required for GPU support.</li><li><strong>PASSWORD</strong>: Sets the password for Jupyter Lab access.</li></ul></li><li><strong>Volumes</strong>:<ul><li>Maps the host directory <code>./data/jupyter/</code> to the container&apos;s <code>/home/jovyan/</code> to persist notebooks and data.</li></ul></li><li><strong>Command</strong>:<ul><li>Runs <code>start-notebook.py</code> to initiate the Jupyter Lab server.</li></ul></li><li><strong>Devices</strong>:<ul><li>Maps <code>/dev/dri</code> to allow GPU access within the container.</li></ul></li><li><strong>Deployment Resources</strong>:<ul><li>Reserves GPU resources by specifying the NVIDIA driver, device IDs, and capabilities.</li></ul></li></ol><p>A guide for using Jupyter labs is out of scope for this blog 
post, but Jupyter has great documentation available here:<br><a href="https://jupyter-notebook.readthedocs.io/en/stable/">https://jupyter-notebook.readthedocs.io/en/stable/</a></p><p>I&apos;ve been working in my Jupyter Lab for a few months now and it has been a fantastic experience. I really wanted to set up the Jupyter Lab so I could test HuggingFace models out, try different projects, and not have to worry about building a whole app every time.</p><p>I&apos;ll be writing some future blog posts on the different projects I&apos;ve tackled using it.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-14-1.png" class="kg-image" alt="Setting Up and Using Jupyter Labs with Docker Compose" loading="lazy" width="2000" height="1029" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-14-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-14-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-14-1.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-14-1.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><h3 id="future-topics">Future Topics</h3><p>Stay tuned for future blog posts where we will use Jupyter Labs to delve into:</p><ul><li><strong>Generative LLMs</strong>: Exploring generative language models.</li><li><strong>Machine Learning</strong>: Notebooks on predicting using machine learning techniques.</li><li><strong>Sentiment Analysis</strong>: Projects on analyzing sentiment in text data.</li><li><strong>Text Summarization</strong>: Notebooks for summarizing text.</li><li><strong>Transcription</strong>: Projects related to transcribing audio data.</li></ul><h3 id="conclusion">Conclusion</h3><p>Setting up Jupyter Lab with Docker Compose streamlines the deployment process, especially when leveraging GPU resources. 
This setup ensures a consistent, isolated environment, enhancing reproducibility and performance for data science tasks. Whether you&apos;re prototyping machine learning models or visualizing complex datasets, Jupyter Lab offers a powerful and flexible platform to support your work. Happy coding!</p>]]></content:encoded></item><item><title><![CDATA[Home Server Mk. 2]]></title><description><![CDATA[<p>With everything I learned in the development of Mk. 1, I had a better idea of what Mk. 2 was going to be.</p><p>I knew it was all going to be built on the back of <a href="https://blog.good-spiders.comt/post/6634e246cf2b380001f21d6a" rel="noreferrer">Docker Compose</a>; after completely borking up my bare metal server dozens of times, I</p>]]></description><link>https://blog.good-spiders.com/home-server-mk-2/</link><guid isPermaLink="false">6634fea6cf2b380001f21e35</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 16:27:20 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/farai-gandiya-L6_BBAyWwnw-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/farai-gandiya-L6_BBAyWwnw-unsplash.jpg" alt="Home Server Mk. 2"><p>With everything I learned in the development of Mk. 1, I had a better idea of what Mk. 2 was going to be.</p><p>I knew it was all going to be built on the back of <a href="https://blog.good-spiders.comt/post/6634e246cf2b380001f21d6a" rel="noreferrer">Docker Compose</a>; after completely borking up my bare metal server dozens of times, I knew I wanted as few dependencies installed locally as physically possible. Almost every service I have running on Mk. 
2 came as a pre-packaged container, with a few API servers built from scratch on boilerplate containers.</p><p>The screenshot below is of my <a href="https://hub.docker.com/r/linuxserver/heimdall/" rel="noreferrer">Heimdall</a> container; it&apos;s like a web launcher for all of my other home server apps. Heimdall is just one of the many completely free, open-source containers <a href="https://www.linuxserver.io/" rel="noreferrer">LinuxServer.io</a> offers. I cannot praise them enough; their containers are phenomenal, well documented, and almost all plug-n-play. I&apos;m not affiliated with them, but I&apos;ve used a dozen of their containers and love them all. I&apos;m not going to discuss every container seen in the screenshot, but I will cover what is essentially the backbone of Mk. 2.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="1027" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><p>To get started, it&apos;s important to note that when I refer to a container, it is generally defined by a few lines of code. The screenshot below is a fairly average length definition for a container in Docker Compose. It has a name, an image (the pre-packaged container), some settings, and environment variables. The container below updates my CloudFlare DNS account when my IP changes.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-5.png" class="kg-image" alt="Home Server Mk. 
2" loading="lazy" width="1017" height="270" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-5.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-5.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-5.png 1017w" sizes="(min-width: 720px) 720px"></figure><h2 id="files-file-management-with-docker-compose"><strong>Files &amp; File Management with Docker Compose</strong></h2><p>I&apos;ve pared down my file structure to be as simple and quick to navigate as possible. If a container has configuration files, or if I want to manipulate files inside of a container, it gets its own little directory in the data directory.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-6.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="337" height="823"></figure><p>If I want to give a container access to system files, it gets a volume where the left side of the definition is the directory on the system and the right is the directory in the container. This effectively creates a live connection between the two. The definition below gives my Netdata container read-only (:ro) access to the system files it needs to monitor performance.</p><pre><code>volumes:
    - /proc:/host/proc:ro
    - /sys:/host/sys:ro
</code></pre>
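<p>For context, here&apos;s a minimal sketch of how that volumes block sits inside a full service definition (the image tag and restart policy here are illustrative, not my exact config):</p><pre><code>services:
  netdata:
    image: netdata/netdata  # pre-packaged container image
    restart: unless-stopped
    volumes:
      # host path (left) : container path (right) : read-only flag
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
</code></pre>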
<p>Linking files and directories in containers to the local system makes it really easy to see what&apos;s going on, and you can be as restrictive or permissive as needed. I usually use it for logs and configs, as seen in the definition below.</p><pre><code>volumes:
    - ./data/pt_wings/config:/etc/pterodactyl/
    - ./data/pt_wings/logs:/var/log/pterodactyl/
</code></pre>
<h2 id="services-i-use-nearly-every-day"><strong>Services I Use Nearly Every Day</strong></h2><p>Some of the services I have running on Mk. 2 are set-it-and-forget-it; others are tools I use nearly every day. Among those are <a href="https://hub.docker.com/r/portainer/portainer-ce" rel="noreferrer">Portainer</a>, <a href="https://hub.docker.com/r/amir20/dozzle" rel="noreferrer">Dozzle</a>, and <a href="https://hub.docker.com/r/netdata/netdata" rel="noreferrer">Netdata</a>. My Docker Compose definitions are almost identical to those found on their respective Docker Hub pages.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-7.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="1437" height="1725" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-7.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-7.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-7.png 1437w" sizes="(min-width: 720px) 720px"></figure><p><strong>Portainer</strong>:<br>I don&apos;t use Portainer to create containers even though it can do that; I still write my docker-compose.yaml files by hand and usually start containers initially via CLI. That being said, it is phenomenal for getting a high-level view of the different projects I have going on; each docker-compose.yaml represents a &quot;Stack&quot; in Portainer.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-8.png" class="kg-image" alt="Home Server Mk.
2" loading="lazy" width="2000" height="619" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-8.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-8.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-8.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-8.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p>When viewing a stack, it gives you an overview of each container in that stack. You can easily see each container&apos;s name, state, and ports at a glance, and you can start, stop, or restart them: most of the high-use commands in the Docker Compose package are covered.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-9.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="875" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-9.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-9.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-9.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-9.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p><strong>Dozzle:</strong><br>Dozzle is great for quickly viewing logs and memory/CPU usage per container. Portainer also has log-viewing capability, but it&apos;s less feature-rich than Dozzle. You can see Dozzle has a very similar stack/container layout in its navigation as well.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-10.png" class="kg-image" alt="Home Server Mk.
2" loading="lazy" width="2000" height="1024" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-10.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-10.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-10.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-10.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p><strong>Netdata:</strong><br>Now, if you read my post on Mk. 1, then you know I built a Laravel application leveraging CAdvisor, Prometheus, and Grafana to create a system monitoring page. That was months of development. Well, I replaced all of that with Netdata in about 5 minutes. It lets you view basically anything going on in the system; the free version has more information available than I&apos;ll ever use. It&apos;s great to see what is going on with the bare metal server itself.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-3.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="1026" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-3.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-3.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-3.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-3.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="other-important-services">Other Important Services</h2><p>There are a lot more services that go into running Mk. 2. I&apos;ll cover most of them briefly here.</p><p><strong>Docker Socket Proxy:</strong><br>This is a super simple container that replaces any lines where you would otherwise need to allow a container (like Portainer) to access the Docker Socket, which is used for starting/stopping/other Docker operations.
It acts as a security gateway/proxy for those requests.</p><p><strong>NGINX Proxy Manager:</strong><br>I could rave about this for days, but it will require its own article along with other advanced networking topics. For now, suffice it to say it makes managing reverse proxies, caching, SSL, and HTTPS routing incredibly simple.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-11.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="794" height="960" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-11.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-11.png 794w" sizes="(min-width: 720px) 720px"></figure><p>These are some services I basically forget exist most of the time.</p><p><strong>DockerGC:</strong><br>This just frees up resources that are left dangling in Docker environments.</p><p><strong>WatchTower:</strong><br>This is great as it keeps your Docker containers up to date automatically, although sometimes you may not want that.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-12.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="1069" height="990" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-12.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-12.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-12.png 1069w" sizes="(min-width: 720px) 720px"></figure><h2 id="conclusion">Conclusion</h2><p>That about wraps it up. Those were some of the first services I got running on my home network, and they laid the groundwork for dozens of projects to follow. They make managing my projects super simple.</p>]]></content:encoded></item><item><title><![CDATA[Home Server Mk.
1]]></title><description><![CDATA[<p>Home server, home lab, tinker station, I don&apos;t know what to call the Frankenstein of a computer I&apos;ve built, but it&apos;s far and away my favorite thing to play with. This is the story of how it came to be.</p><p>It started with a</p>]]></description><link>https://blog.good-spiders.com/home-server/</link><guid isPermaLink="false">6634f038cf2b380001f21dbb</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 15:09:14 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/patrik-kernstock-8yN3T4XDJ70-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/patrik-kernstock-8yN3T4XDJ70-unsplash.jpg" alt="Home Server Mk. 1"><p>Home server, home lab, tinker station, I don&apos;t know what to call the Frankenstein of a computer I&apos;ve built, but it&apos;s far and away my favorite thing to play with. This is the story of how it came to be.</p><p>It started with a computer case and motherboard my former employer was dumping; it was a workstation at one point in time. I bought a used server motherboard, Xeon processor, ECC RAM, and for kicks I stuck my old GeForce 1080 GPU in it. That sentence makes it sound like it happened overnight, but it took well over a year to get all of the hardware assembled into what it is today. This blog is hosted on that Frankenstein.</p><p>Now, the question remains: why? I got an A.S. in Computer Science and continued on to a B.S. in Information Technology. I thoroughly enjoyed most of the development I did, but I generally lack direction. I wanted to have a place where I could try and develop different applications, quickly iterate over designs, prototype, and most importantly scrap junk easily when needed.
So I decided I&apos;d scrap together my home server.</p><p>I started by building little bash scripts for a former employer; eventually I was developing whole pages and programs in PHP. That got old real fast; as it turns out, I have a strong distaste for front end development. <a href="https://laravel.com/" rel="noreferrer">Laravel</a> helped ease that pain, some. What it really did was introduce me to frameworks. My web development teacher always told me &quot;you&apos;re going to need to know how to write HTML from scratch,&quot; == false, it never hurt though.</p><p>The way that Laravel loads its parts was fascinating to me: you could edit a page, refresh the browser, and see the changes immediately; it was like magic. I wanted to build my own Laravel application. I had no idea what I was going to do with it, but I was going to do it. I heard another buzzword at that time and thought, I&apos;ll <em>containerize </em>it too!</p><p>So without a real plan I started by following this lovely guide from Digital Ocean:<br><a href="https://www.digitalocean.com/community/tutorials/how-to-install-and-set-up-laravel-with-docker-compose-on-ubuntu-22-04">https://www.digitalocean.com/community/tutorials/how-to-install-and-set-up-laravel-with-docker-compose-on-ubuntu-22-04</a></p><p>It was all going fairly well, learning how to prepare an application to be containerized and how to get the containers to communicate, but when it came time to make it available to external connections I was introduced to the boss fight: NGINX. Having never heard it spoken, only read it, I pronounced it en&#x2013;jinx. It was only a little embarrassing the first time a friend corrected me on that pronunciation. Learning NGINX was no joke; it was the better part of 6 months of debugging every time I needed to expose a new service. I learned all about reverse proxying, configuring HTTPS, caching, and so much more.
</p><p>Once I finally had a stable application, accessible from the web, it was time to do something with it. I didn&apos;t really have any idea where I was going to start, or what I was going to do. I was worried about resource utilization; this was my first dedicated server, after all, and I wanted a web interface to monitor it. So I set out to learn how to monitor a system. I learned all about <a href="https://github.com/google/cadvisor" rel="noreferrer">CAdvisor</a>, <a href="https://prometheus.io/" rel="noreferrer">Prometheus</a>, and <a href="https://grafana.com/" rel="noreferrer">Grafana</a>. By their powers combined I had put together a monitoring page. This would be the foundation for several projects that would be built on the back of the monolithic app that would be referred to as <a href="https://github.com/cskujawa/jarvis-ai/tree/main" rel="noreferrer">J.A.R.V.I.S</a>. Just A Really Versatile Information System.</p><figure class="kg-card kg-image-card"><img src="https://github.com/cskujawa/jarvis-ai/blob/main/interface/laravel/public/image/app.png?raw=true" class="kg-image" alt="Home Server Mk. 1" loading="lazy" width="1920" height="949"></figure><p>Now, in many of my positions I have functioned, or currently function, as a subject matter expert, and that means being able to explain to others what something is, how it works, etc. So I wrote <a href="https://github.com/cskujawa/jarvis-ai/tree/main/docs" rel="noreferrer">guides</a> for setting up that project using WSL and on a bare metal server (Ubuntu). I would not recommend using either of those guides or that project; it was my Home Server Mk. 1, full of flaws, bad logic, half-finished ideas, etc.</p><p>All of the skills I learned along the way prepared me for the adventure that would be my Home Server Mk.
2.</p>]]></content:encoded></item><item><title><![CDATA[Unleashing the Benefits of Docker Compose on Bare Metal Servers]]></title><description><![CDATA[<p>In the world of software deployment, Docker has revolutionized how applications are deployed and managed across different environments. Docker Compose, a tool for defining and running multi-container Docker applications, offers an added layer of convenience and efficiency, particularly when used on bare metal servers. This blog post explores the significant</p>]]></description><link>https://blog.good-spiders.com/unleashing-the-benefits-of-docker-compose-on-bare-metal-servers/</link><guid isPermaLink="false">6634e246cf2b380001f21d6a</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 14:07:32 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/MainImage-2.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/MainImage-2.jpeg" alt="Unleashing the Benefits of Docker Compose on Bare Metal Servers"><p>In the world of software deployment, Docker has revolutionized how applications are deployed and managed across different environments. Docker Compose, a tool for defining and running multi-container Docker applications, offers an added layer of convenience and efficiency, particularly when used on bare metal servers. This blog post explores the significant benefits of using Docker Compose on a bare metal server, from improved resource utilization to simplified operational procedures.</p><p><strong>Enhanced Performance and Resource Utilization</strong>:<br>One of the standout benefits of using Docker Compose on a bare metal server is the direct access to hardware resources, which translates into enhanced performance. In my home lab I can add about 10 lines to any container definition to give the container access to my GPU. 
A few lines can define where and how a container stores data, static IPs, subnets, resource restrictions, etc.</p><p><strong>Simplified Management and Scalability</strong>:<br>Docker Compose simplifies the management of container-based applications. By defining your multi-container setup in a single YAML file, you can manage the entire lifecycle of your application stack with simple commands. This not only makes setup and teardown incredibly efficient but also ensures consistency across different development, testing, and production environments. I&apos;ve completely wiped and reinstalled the OS on my bare metal server and had my entire home lab back up and running in 10 minutes. It&apos;s as simple as cloning the project definition and any persistent data files and then starting the project.</p><p><strong>Consistency Across Environments</strong>:<br>Using Docker Compose on bare metal servers can significantly reduce the &quot;it works on my machine&quot; syndrome. The containers encapsulate all dependencies, ensuring that the application runs the same way, regardless of where it is deployed. This consistency is crucial for reducing bugs and errors that typically arise from environmental discrepancies during deployment.</p><p><strong>Isolation and Security</strong>:<br>Each container managed by Docker Compose runs in isolation, sharing only the kernel and essential resources. This isolation helps in minimizing conflict between running applications and enhances security by limiting the surface area for potential attacks. Regular updates and easy rollback features further bolster security and application stability.</p><p><strong>Some Great Docker Compose Features:</strong></p><p>Want your frontend to wait for the database to load before starting? It&apos;s just two lines:</p>
<pre><code>depends_on:
    - database
</code></pre>
<p>This can be expanded to include a health check operation as well:</p>
<pre><code>depends_on:
    database:
        condition: service_healthy
</code></pre>
<p>Most services have health checks pre-written somewhere on the internet that can be used easily. For instance, this is one I use for MariaDB, which I snagged from somewhere:</p>
<pre><code>healthcheck:
    interval: 30s
    retries: 3
    test:
        [
          &quot;CMD&quot;,
          &quot;healthcheck.sh&quot;,
          &quot;--su-mysql&quot;,
          &quot;--connect&quot;,
          &quot;--innodb_initialized&quot;
        ]
    timeout: 30s
</code></pre>
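<p>To see how the two pieces fit together, here&apos;s a minimal sketch (the frontend image name is a placeholder) where the frontend waits until the database reports healthy before starting:</p><pre><code>services:
  database:
    image: mariadb:latest
    healthcheck:
      test: [&quot;CMD&quot;, &quot;healthcheck.sh&quot;, &quot;--su-mysql&quot;, &quot;--connect&quot;, &quot;--innodb_initialized&quot;]
      interval: 30s
      timeout: 30s
      retries: 3
  frontend:
    image: my-frontend:latest  # placeholder image
    depends_on:
      database:
        condition: service_healthy
</code></pre>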
<p>Creating persistent volumes allows you to completely nuke a container and restart with all of its stored files intact:</p>
<pre><code>volumes:
  # MySQL Database
  db-data: {}
</code></pre>
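<p>A named volume only does something once it&apos;s attached to a service. Here&apos;s a minimal sketch; /var/lib/mysql is the database image&apos;s default data directory:</p><pre><code>services:
  database:
    image: mariadb:latest
    volumes:
      - db-data:/var/lib/mysql  # data survives container re-creation

volumes:
  # MySQL Database
  db-data: {}
</code></pre>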
<p>Proxying ports is super simple. These two port numbers represent the one inside the container (right) and the one that will be exposed on the bare metal server (left). This makes it possible to run any number of applications that share the same default port, and you only have to change one number in the docker-compose.yaml.</p>
<pre><code>ports:
    - 8081:8080 # Proxy container port 8080 to port 8081 on the host
</code></pre>
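<p>As an example (the service names here are invented), two services that both listen on port 8080 internally can coexist by mapping different host ports:</p><pre><code>services:
  app-one:
    ports:
      - 8081:8080  # host port 8081, container port 8080
  app-two:
    ports:
      - 8082:8080  # host port 8082, same container port
</code></pre>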
<p>Environment variables allow you to securely define protected information and access it in code by variable name:</p>
<pre><code>environment:
    MYSQL_DATABASE: ${DB_DATABASE}
    MYSQL_PASSWORD: ${DB_PASSWORD}
    MYSQL_USER: ${DB_USERNAME}
</code></pre>
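<p>Those ${...} references are resolved from a .env file sitting next to the docker-compose.yaml. A sketch with obviously fake values:</p><pre><code># .env (keep this file out of version control)
DB_DATABASE=app_db
DB_USERNAME=app_user
DB_PASSWORD=change-me
</code></pre><p>Docker Compose reads .env automatically, so the actual secrets stay out of the compose file itself.</p>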
<p><strong>Conclusion</strong>:<br>Embracing Docker Compose on bare metal servers brings forth a plethora of benefits that can significantly enhance application deployment, performance, and management. By reducing overhead, simplifying configurations, and ensuring consistency across environments, Docker Compose stands out as a vital tool for modern IT infrastructure. Whether you&#x2019;re managing complex applications or simple service stacks, Docker Compose paired with bare metal can be a game-changer.</p><p><strong>More Information:</strong></p>
<ul>
<li>Installing Docker (Ubuntu) - <a href="https://docs.docker.com/engine/install/ubuntu/">https://docs.docker.com/engine/install/ubuntu/</a></li>
<li>Installing Compose Plugin (Linux) - <a href="https://docs.docker.com/compose/install/linux/">https://docs.docker.com/compose/install/linux/</a></li>
<li>How Compose Works - <a href="https://docs.docker.com/compose/compose-application-model/">https://docs.docker.com/compose/compose-application-model/</a></li>
</ul>
]]></content:encoded></item></channel></rss>