<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Cole's Blog]]></title><description><![CDATA[That's the problem, last time my legs weren't flattened]]></description><link>https://blog.good-spiders.com/</link><image><url>https://blog.good-spiders.com/favicon.png</url><title>Cole&apos;s Blog</title><link>https://blog.good-spiders.com/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Mon, 04 May 2026 07:26:19 GMT</lastBuildDate><atom:link href="https://blog.good-spiders.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Computer Vision / Streamlit / Memory Management]]></title><description><![CDATA[<p>While working on this project I learned  a lot about the different models used for Computer Vision. I also got a grip on Streamlit which is a pure python front end, I&apos;ve already really started to like Streamlit for it&apos;s simplicity, although I won&apos;t</p>]]></description><link>https://blog.good-spiders.com/computer-vision-showdown/</link><guid isPermaLink="false">666bafd5fe6cdf0001202dbf</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 21 Jun 2024 02:28:52 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/cheetah.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/cheetah.png" alt="Computer Vision / Streamlit / Memory Management"><p>While working on this project I learned  a lot about the different models used for Computer Vision. 
I also got a grip on Streamlit, a pure-Python front end. I&apos;ve really come to like Streamlit for its simplicity, although I won&apos;t be covering it in this post.</p><p>Along the way I ran into some issues with RAM usage and learned some new ways to prevent excessive memory consumption and to release RAM properly after using models. One issue I was not able to overcome was with larger LLMs: I simply did not have enough RAM to load them into memory. I ended up opting for OpenAI&apos;s API for the generative-text portion of the project. GPT-2 did load, but its performance was abysmal.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/cskujawa/example-app-cv-model-docker-compose/tree/main"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - cskujawa/example-app-cv-model-docker-compose: Computer Vision app</div><div class="kg-bookmark-description">Computer Vision app. Contribute to cskujawa/example-app-cv-model-docker-compose development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="Computer Vision / Streamlit / Memory Management"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">cskujawa</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/45d2a54a99a64cc3772dad757ff3609c9c1f5a684cbe5eab4794dae21a13ffe8/cskujawa/example-app-cv-model-docker-compose" alt="Computer Vision / Streamlit / Memory Management"></div></a></figure><h2 id="ram-usage">RAM Usage</h2><p>I only figured out two ways to free RAM: either delete the variable or free up the Torch resource (which I was not using in this project).</p><pre><code>    # Clear memory
    del model
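    # Another option worth trying: force a garbage-collection pass after
    # the del so Python can reclaim the freed memory sooner
    import gc
    gc.collect()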
    #torch.cuda.empty_cache() if torch.cuda.is_available() else None</code></pre><p>Now, my server isn&apos;t exactly light on RAM, it hosts a lot of different applications and services, but it has a solid 32GB of EEC RAM. It&apos;s not a huge quantity, but it has served me well up to this point. When it comes to LLMs though, it&apos;s not enough. Below is a screenshot of Netdata with the usage stats at idle.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_At_Rest.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1914" height="938" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_At_Rest.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_At_Rest.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_At_Rest.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_At_Rest.png 1914w" sizes="(min-width: 1200px) 1200px"></figure><p>Below is a screenshot of the same usage stats after starting to load an LLM via transformers.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_Loading_LLM.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1915" height="944" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_Loading_LLM.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_Loading_LLM.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_Loading_LLM.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_Loading_LLM.png 1915w" sizes="(min-width: 1200px) 1200px"></figure><p>Lastly, the usage stats just before the server crashed. 
Nearly 100% RAM usage and 90% swap usage: not good.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/06/Server_Cutoff_LLM-1.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="1917" height="956" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/Server_Cutoff_LLM-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/Server_Cutoff_LLM-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/Server_Cutoff_LLM-1.png 1600w, https://blog.good-spiders.com/content/images/2024/06/Server_Cutoff_LLM-1.png 1917w" sizes="(min-width: 1200px) 1200px"></figure><p>Ultimately I got it to work by using the OpenAI API for text generation instead of a locally hosted LLM. That&apos;s okay; I&apos;ll get it sorted out in the future. Text generation aside, I got the rest of the models loaded in and working. It was easy, too, since all the models I was testing were in the TensorFlow Keras Applications package.</p><pre><code class="language-Python">
from tensorflow.keras.applications.efficientnet import EfficientNetB0, preprocess_input as efficientnet_preprocess_input, decode_predictions as efficientnet_decode_predictions
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input as resnet_preprocess_input, decode_predictions as resnet_decode_predictions
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input as inception_preprocess_input, decode_predictions as inception_decode_predictions
from tensorflow.keras.applications.mobilenet import MobileNet, preprocess_input as mobilenet_preprocess_input, decode_predictions as mobilenet_decode_predictions
from tensorflow.keras.applications.densenet import DenseNet121, preprocess_input as densenet_preprocess_input, decode_predictions as densenet_decode_predictions
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input as vgg16_preprocess_input, decode_predictions as vgg16_decode_predictions
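
# Sketch of one way to pool every model's top guess into a single table
# sorted by confidence, as the post describes below; rows here are
# illustrative (model_name, label, probability) triples
def rank_predictions(rows):
    return sorted(rows, key=lambda row: row[2], reverse=True)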
</code></pre><p>Getting them all loaded in was fine, and using Streamlit to pull together a UI made it easy. To actually glean some insight from the application, though, I made some changes. First, I placed the results from every model in the same table, then sorted the table by probability (the confidence), so whichever model was most confident appeared at the top with its guess. The models aren&apos;t perfect, but at some tasks they excelled.</p><figure class="kg-card kg-image-card kg-width-full"><img src="https://blog.good-spiders.com/content/images/2024/06/image-1.png" class="kg-image" alt="Computer Vision / Streamlit / Memory Management" loading="lazy" width="2000" height="958" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/06/image-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/06/image-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/06/image-1.png 1600w, https://blog.good-spiders.com/content/images/2024/06/image-1.png 2000w"></figure><p>With a little bit of tweaking and some thresholds, I got it to write a poem about the image it saw, provided at least a few of the models reported more than 50% probability in their guesses.</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Understanding Convolutional Neural Networks]]></title><description><![CDATA[<p>Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. 
In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various</p>]]></description><link>https://blog.good-spiders.com/breaking-down-ai-understanding-convolutional-neural-networks-origins-mechanics-and-applications/</link><guid isPermaLink="false">665dedb951dfd20001358c00</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Mon, 03 Jun 2024 16:29:38 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" alt="Breaking Down AI - Understanding Convolutional Neural Networks"><p>Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various applications.</p><h4 id="origins-of-cnns">Origins of CNNs</h4><p>The concept of Convolutional Neural Networks traces back to the 1980s when <a href="https://en.wikipedia.org/wiki/Kunihiko_Fukushima" rel="noreferrer">Kunihiko Fukushima</a> introduced the &quot;<a href="https://link.springer.com/article/10.1007/BF00344251" rel="noreferrer">Neocognitron</a>,&quot; a model designed for pattern recognition. </p><p>The Neocognitron was inspired by the model proposed by&#xA0;<a href="https://en.wikipedia.org/wiki/David_H._Hubel">Hubel</a>&#xA0;&amp;&#xA0;<a href="https://en.wikipedia.org/wiki/Torsten_Wiesel">Wiesel</a>&#xA0;in 1959. 
They found two types of cells in the primary visual cortex, called&#xA0;<a href="https://en.wikipedia.org/wiki/Simple_cell"><em>simple cells</em></a>&#xA0;and&#xA0;<a href="https://en.wikipedia.org/wiki/Complex_cell"><em>complex cells</em></a>, and proposed a cascading model of these two cell types for use in pattern recognition tasks.</p><p>However, it wasn&apos;t until the 1990s that CNNs gained significant traction, largely due to the pioneering work of <a href="https://en.wikipedia.org/wiki/Yann_LeCun" rel="noreferrer">Yann LeCun</a> and his colleagues. They developed the <a href="https://medium.com/@siddheshb008/lenet-5-architecture-explained-3b559cb2d52b" rel="noreferrer">LeNet architecture</a>, which was successfully used for digit recognition in zip codes, showcasing the practical potential of CNNs.</p><h4 id="how-cnns-work">How CNNs Work</h4><p>CNNs are specially designed for processing grid-like data, such as images. Here&apos;s a breakdown of their key components and how they function:</p><ol><li><strong>Convolutional Layers:</strong><br>Convolutional layers apply convolution operations to the input data. A convolution operation involves a filter (or kernel) that slides over the input data, performing element-wise multiplication and summation to produce a feature map. These filters are learnable parameters, enabling the network to identify essential features such as edges, textures, and patterns.</li><li><strong>Pooling Layers:</strong><br>Pooling layers reduce the spatial dimensions of the feature maps, making computations more efficient and reducing the risk of overfitting. 
Common pooling operations include max pooling (selecting the maximum value within a window) and average pooling (computing the average value within a window).</li><li><strong>Activation Functions:</strong><br>Non-linear activation functions, such as ReLU (Rectified Linear Unit), introduce non-linearity into the model, allowing it to learn more complex patterns.</li><li><strong>Fully Connected Layers:</strong><br>After several convolutional and pooling layers, the feature maps are flattened and fed into fully connected (dense) layers. These layers function like traditional neural networks, where each neuron is connected to every neuron in the previous layer.</li><li><strong>Output Layer:</strong><br>The final layer produces the output, which can be a classification (e.g., softmax for multi-class classification) or regression (e.g., linear activation for continuous values).</li></ol><h4 id="why-cnns-are-used">Why CNNs Are Used</h4><p>CNNs are popular for several reasons:</p><ol><li><strong>Spatial Hierarchy of Features:</strong><br>CNNs effectively capture spatial hierarchies in images, from low-level features (like edges) to high-level features (like objects). This hierarchical learning enables robust feature extraction.</li><li><strong>Parameter Sharing:</strong><br>By using the same filter across different parts of the input, CNNs significantly reduce the number of parameters compared to fully connected networks, making them more efficient and less prone to overfitting.</li><li><strong>Translation Invariance:</strong><br>CNNs are inherently translation-invariant, meaning they can recognize objects regardless of their position in the image. 
This property is crucial for tasks like object detection and image classification.</li><li><strong>Versatility:</strong><br>While originally designed for image data, CNNs have been successfully applied to various types of data, including time-series data, audio signals, and even text (in the form of character-level or word-level embeddings).</li><li><strong>Performance:</strong><br>CNNs have achieved state-of-the-art performance in many computer vision tasks, including image classification (e.g., AlexNet, VGG, ResNet), object detection (e.g., YOLO, Faster R-CNN), and image segmentation (e.g., U-Net).</li></ol><h4 id="applications-of-cnns">Applications of CNNs</h4><p>CNNs are used in a wide range of applications:</p><ul><li><strong>Image Classification:</strong> Identifying objects within images (e.g., ImageNet competition).</li><li><strong>Object Detection:</strong> Locating and classifying objects within an image (e.g., self-driving cars).</li><li><strong>Image Segmentation:</strong> Dividing an image into segments for detailed analysis (e.g., medical imaging).</li><li><strong>Face Recognition:</strong> Identifying and verifying individuals&apos; faces (e.g., security systems).</li><li><strong>Natural Language Processing:</strong> Applying CNNs to text data for tasks like sentiment analysis and language modeling.</li></ul><h3 id="conclusion">Conclusion</h3><p>Convolutional Neural Networks are a powerful and versatile tool in deep learning, particularly suited for tasks involving visual data and other structured grid-like data. Their ability to automatically and adaptively learn spatial hierarchies of features makes them invaluable for a wide range of applications. 
As the field of deep learning continues to evolve, CNNs will undoubtedly remain at the forefront of technological advancements.</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Exploring Image Classification with Neural Networks]]></title><description><![CDATA[<p>I started this project based on a recommendation from a friend. I didn&apos;t know at the time that what I was embarking on was a Convolutional Neural Network journey. That is what it ended up being, though. To further explore the concept I wrote a companion</p>]]></description><link>https://blog.good-spiders.com/exploring-image-classification-with-neural-networks/</link><guid isPermaLink="false">665de6e751dfd20001358bb8</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Mon, 03 Jun 2024 16:21:19 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/06/empty_07.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/06/empty_07.jpg" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"><p>I started this project based on a recommendation from a friend. I didn&apos;t know at the time that what I was embarking on was a Convolutional Neural Network journey. 
That is what it ended up being, though. To further explore the concept, I wrote a companion blog post about CNNs to provide a bit more context and to learn more about them myself.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://blog.good-spiders.com/breaking-down-ai-understanding-convolutional-neural-networks-origins-mechanics-and-applications/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Breaking Down AI - Understanding Convolutional Neural Networks</div><div class="kg-bookmark-description">Convolutional Neural Networks (CNNs) have revolutionized the field of computer vision and deep learning, becoming a cornerstone technology for image processing and recognition tasks. In this blog post, we will explore the origins of CNNs, delve into how they work, and discuss why they are so widely used across various</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://blog.good-spiders.com/content/images/size/w256h256/2024/05/good-spiders-logo.png" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"><span class="kg-bookmark-author">Good Spiders Blog</span><span class="kg-bookmark-publisher">Cole Kujawa</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://blog.good-spiders.com/content/images/2024/06/1680532048475.jpg" alt="Breaking Down AI - Exploring Image Classification with Neural Networks"></div></a></figure><p>Full Jupyter Notebook for this post is here:<br><a href="https://github.com/cskujawa/jupyter-notebooks/blob/main/ML/FullBowl/FullBowl.ipynb">https://github.com/cskujawa/jupyter-notebooks/blob/main/ML/FullBowl/FullBowl.ipynb</a></p><h3 id="project-overview">Project Overview</h3><p>The goal of this project was to build a model capable of classifying images of bowls as either full or empty. The project was implemented using Python in a Jupyter Notebook, leveraging the power of deep learning frameworks like Torch. 
Here&apos;s a step-by-step guide to the process, including detailed notes from the notebook to provide additional context and insights.</p><h3 id="step-1-setting-up-the-environment">Step 1: Setting Up the Environment</h3><p>Before diving into the code, it was essential to set up the environment. This included installing the necessary libraries and frameworks.</p><ul><li><code>torch</code>: Core library.</li><li><code>torch.utils.data.DataLoader</code>: Utility to load data in batches.</li><li><code>torchvision.datasets</code>: Contains many standard vision datasets.</li><li><code>torchvision.transforms</code>: Common image transformations.</li><li><code>datasets.load_dataset</code>: Function to load datasets from Hugging Face.</li><li><code>ViTFeatureExtractor</code>: A feature extractor for ViT models.</li></ul><pre><code class="language-Python">import numpy as np
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from datasets import load_dataset
from transformers import ViTFeatureExtractor</code></pre><h3 id="step-2-preparing-the-data">Step 2: Preparing the Data</h3><p>The data preparation phase involved loading and preprocessing the images. This step is crucial as it ensures the data is in the right format for training the model. The images were resized, normalized, and augmented to improve the model&apos;s performance and generalization.</p><h4 id="loading-and-preprocessing-images">Loading and Preprocessing Images</h4><pre><code class="language-Python"># Load dataset
dataset = load_dataset(&apos;training&apos;, data_dir=&apos;data&apos;)

# The ViT feature extractor applied inside the transform below
from transformers import ViTFeatureExtractor
feature_extractor = ViTFeatureExtractor.from_pretrained(&apos;google/vit-base-patch16-224&apos;)

# Split the dataset into training and validation sets
train_test_split = dataset[&apos;train&apos;].train_test_split(test_size=0.2)
train_dataset = train_test_split[&apos;train&apos;]
eval_dataset = train_test_split[&apos;test&apos;]

# Transform function
def transform(example_batch):
    # Convert images to RGB and apply feature extraction
    images = [image.convert(&quot;RGB&quot;) for image in example_batch[&apos;image&apos;]]
    inputs = feature_extractor(images, return_tensors=&apos;pt&apos;)
    inputs[&apos;labels&apos;] = example_batch[&apos;label&apos;]
    return inputs

# Apply transformations to datasets
train_dataset.set_transform(transform)
eval_dataset.set_transform(transform)</code></pre><h3 id="step-3-load-the-vit-model">Step 3: Load the ViT model</h3><p>This phase is where we load the pre-trained Vision Transformer (ViT) model and its corresponding feature extractor from Hugging Face. It prepares the model for image classification tasks with the specified number of output labels.</p><p>Libraries:</p><ul><li><code>ViTForImageClassification</code>: Model class for Vision Transformer.</li><li><code>from_pretrained</code>: Loads a pre-trained model from Hugging Face&apos;s model hub.</li><li><code>num_labels</code>: Number of output labels.</li><li><code>ignore_mismatched_sizes</code>: Useful for fine-tuning a model with different input sizes.</li></ul><pre><code class="language-Python">from torch.utils.data import DataLoader
from transformers import ViTForImageClassification, TrainingArguments

model = ViTForImageClassification.from_pretrained(&apos;google/vit-base-patch16-224&apos;, num_labels=2, ignore_mismatched_sizes=True)

# DataLoaders consumed by the training and evaluation loops below
# (batch size matches the TrainingArguments values)
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
eval_loader = DataLoader(eval_dataset, batch_size=8)

training_args = TrainingArguments(
    output_dir=&apos;./results&apos;,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    evaluation_strategy=&quot;epoch&quot;,
    save_strategy=&quot;epoch&quot;,
    logging_dir=&apos;./logs&apos;,
)</code></pre><h3 id="step-4-training-the-model">Step 4: Training the Model</h3><p>Training the model involved feeding it the prepared images and letting it learn the distinguishing features of full and empty bowls. This was done over multiple epochs, with the model&apos;s performance being evaluated on a validation set.</p><h3 id="set-up-optimizer-and-learning-rate-scheduler">Set up optimizer and learning rate scheduler</h3><p>This block sets up the optimizer and learning rate scheduler for training the model. The optimizer updates the model parameters to minimize the loss function, while the learning rate scheduler adjusts the learning rate during training to improve performance.</p><p>Libraries:</p><ul><li><code>AdamW</code>: Optimizer with weight decay fix, recommended for transformers.</li><li><code>get_scheduler</code>: Utility to get a learning rate scheduler.</li><li><code>tqdm.auto.tqdm</code>: Progress bar library.</li></ul><pre><code class="language-Python">from transformers import AdamW, get_scheduler
from tqdm.auto import tqdm

optimizer = AdamW(model.parameters(), lr=5e-5)

num_epochs = 3
num_training_steps = num_epochs * len(train_loader)
lr_scheduler = get_scheduler(
    name=&quot;linear&quot;, optimizer=optimizer, num_warmup_steps=0, num_training_steps=num_training_steps
)
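
# For intuition: with zero warmup steps, the "linear" schedule scales the
# learning rate by a factor that decays from 1 to 0 over training, roughly:
def linear_lr_factor(step, num_training_steps):
    return max(0.0, (num_training_steps - step) / num_training_steps)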

progress_bar = tqdm(range(num_training_steps))

model.train()
for epoch in range(num_epochs):
    for batch in train_loader:
        # Move batch to device (GPU or CPU)
        batch = {k: v.to(model.device) for k, v in batch.items()}

        # Forward pass
        # print(f&quot;batch: {batch}&quot;)  # uncomment to inspect a raw batch
        outputs = model(**batch)
        loss = outputs.loss

        # Backward pass
        loss.backward()

        # Update weights
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        progress_bar.update(1)

        # Print loss for debugging
        progress_bar.set_postfix({&quot;loss&quot;: loss.item()})</code></pre><h3 id="step-5-evaluating-the-model">Step 5: Evaluating the Model</h3><p>After training, ideally the model&apos;s performance would be evaluated on a separate test set, but I was tired of sourcing images for the dataset so I used the existing images. This step would usually provide insights into how well the model generalized to new, unseen data.</p><p>Libraries:</p><ul><li><code>model.eval()</code>: Sets the model to evaluation mode.</li><li><code>torch.no_grad()</code>: Disables gradient calculation for faster evaluation.</li><li><code>batch.items()</code>: Iterates over the items in a batch.</li></ul><pre><code class="language-Python">model.eval()
total_correct = 0
total_samples = 0

for batch in eval_loader:
    with torch.no_grad():
        # Move batch to device (GPU or CPU)
        batch = {k: v.to(model.device) for k, v in batch.items()}

        # Forward pass
        outputs = model(**batch)

        # Access logits
        logits = outputs.logits

        # Get predictions
        predictions = torch.argmax(logits, dim=-1)

        # Get true labels (assuming they are in &apos;labels&apos; key of batch)
        labels = batch[&apos;labels&apos;]

        # Count correct predictions
        correct = (predictions == labels).sum().item()

        # Update totals
        total_correct += correct
        total_samples += labels.size(0)

# Calculate accuracy
accuracy = (total_correct / total_samples) * 100
print(f&quot;Accuracy: {accuracy:.2f}%&quot;)</code></pre><h3 id="step-6-save-the-model-and-feature-extractor">Step 6: Save the model and feature extractor</h3><p>After training and evaluating, I saved the trained model and the feature extractor to disk so that they can be loaded and used later without retraining.</p><p>Libraries:</p><ul><li><code>save_pretrained</code>: Saves the model and feature extractor for later use.</li></ul><pre><code class="language-Python">model.save_pretrained(&apos;models/cat_bowl_model&apos;)
feature_extractor.save_pretrained(&apos;models/cat_bowl_model&apos;)</code></pre><h3 id="step-7-making-predictions">Step 7: Making Predictions</h3><p>Finally, the model was used to make predictions on new images, if only a few (images 6 and 7 for the full and empty bowls). This section demonstrates how to load the saved model and feature extractor, preprocess a new image, and make a prediction using the model. It also prints the predicted label for the given image.</p><p>Libraries:</p><ul><li><code>PIL.Image</code>: Python Imaging Library, used to open and manipulate images.</li><li><code>ViTImageProcessor</code>: Processes images for the ViT model.</li><li><code>model(**inputs).logits</code>: Forward pass to get logits (raw model outputs).</li><li><code>logits.argmax(-1)</code>: Gets the index of the highest logit, representing the predicted class.</li></ul><pre><code class="language-Python">from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch  # torch.no_grad() is used in predict() below

# Load the fine-tuned model and image processor
model = ViTForImageClassification.from_pretrained(&apos;models/cat_bowl_model&apos;)
processor = ViTImageProcessor.from_pretrained(&apos;models/cat_bowl_model&apos;)

# Function to load and preprocess the image
def load_and_preprocess_image(image_path):
    image = Image.open(image_path)
    inputs = processor(images=image, return_tensors=&quot;pt&quot;)
    return inputs

# Function to predict if the bowl is full or empty
def predict(image_path):
    inputs = load_and_preprocess_image(image_path)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}  # Move to device
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    return predicted_class_idx

# Class labels
class_labels = [&quot;empty&quot;, &quot;full&quot;]
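
# Quick sanity check of how an argmax over the two logits maps onto these
# labels (pure-Python illustration with made-up logit values)
def label_from_logits(logits, labels=("empty", "full")):
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]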

# Main function for CLI
def main():
    # Classify each of the test images (seven empty, seven full)
    image_paths = [f&quot;training/test/empty_{i:02d}.jpg&quot; for i in range(1, 8)]
    image_paths += [f&quot;training/test/full_{i:02d}.jpg&quot; for i in range(1, 8)]
    for image_path in image_paths:
        predicted_class_idx = predict(image_path)
        print(f&quot;The bowl in {image_path} is {class_labels[predicted_class_idx]}.&quot;)

if __name__ == &quot;__main__&quot;:
main()</code></pre><h2 id="conclusion">Conclusion</h2><p>And finally, the results. Different training runs have produced varying outcomes, and there is always a degree of inaccuracy with this project, which I believe is due largely to the limited dataset used for training.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/06/image.png" class="kg-image" alt="Breaking Down AI - Exploring Image Classification with Neural Networks" loading="lazy" width="543" height="371"></figure><p>There are many ways this project could be extended to further improve its accuracy.</p><ol><li><strong>Data Augmentation</strong>: Although some augmentation was applied, further augmenting the dataset with more varied transformations (such as rotations, flips, and color adjustments) could help the model generalize better.</li><li><strong>Increasing Dataset Size</strong>: Collecting more images for training could significantly enhance the model&apos;s ability to learn diverse features, leading to better performance.</li><li><strong>Hyperparameter Tuning</strong>: Experimenting with different hyperparameters, such as learning rates, batch sizes, and optimizer choices, could yield better training outcomes.</li><li><strong>Advanced Architectures</strong>: Implementing more advanced architectures like ResNet, VGG, or EfficientNet, which are designed to handle image classification tasks more effectively, could improve accuracy.</li></ol>]]></content:encoded></item><item><title><![CDATA[Getting Started with Pretrained Models from Hugging Face in Jupyter Lab]]></title><description><![CDATA[<p>Today we&apos;re going to be getting started with pretrained models from Hugging Face. If you&apos;re new to the world of machine learning or just looking to explore how to use powerful pretrained models, you&apos;re in the right place. 
We&apos;ll be running our</p>]]></description><link>https://blog.good-spiders.com/getting-started-with-huggingface/</link><guid isPermaLink="false">665a0ee3bc67790001e144d3</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 20:01:45 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/hf-logo-with-title.png" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/hf-logo-with-title.png" alt="Getting Started with Pretrained Models from Hugging Face in Jupyter Lab"><p>Today we&apos;re going to be getting started with pretrained models from Hugging Face. If you&apos;re new to the world of machine learning or just looking to explore how to use powerful pretrained models, you&apos;re in the right place. We&apos;ll be running our project in Jupyter Lab, so make sure you have it installed and ready to go, if not you can check out my post on Jupyter Lab <a href="https://blog.good-spiders.com/running-jupyter-labs/" rel="noreferrer">here</a>. Let&apos;s dive in!</p><h2 id="what-is-hugging-face">What is Hugging Face?</h2><p>Hugging Face is a company that has made a significant impact in the field of natural language processing (NLP). They offer a library called <code>transformers</code>, which provides access to a wide variety of pretrained models for tasks like text classification, question answering, translation, and much more. These models are trained on massive datasets and can save you a ton of time and resources.</p><h2 id="installing-the-transformers-library-and-dependencies">Installing the Transformers Library and Dependencies</h2><p>Before we can use Hugging Face&apos;s models, we need to install the <code>transformers</code> library and any necessary dependencies. Open a new terminal in Jupyter Lab and run:</p><pre><code class="language-Python"># Install transformers dependencies
!pip install -qU transformers
!pip install accelerate
# Reinstall TensorFlow with the [and-cuda] extra for GPU support (the base package should already be present in this container)
!pip install tensorflow[and-cuda]
import transformers</code></pre><p>This will install the library and its dependencies, allowing us to access and use the pretrained models.</p><h2 id="signing-in-to-hugging-face">Signing in to Hugging Face</h2><p>Before we can access Hugging Face&apos;s models we&apos;ll need to login. You&apos;ll need your authentication information from your Hugging Face account, I use the authentication token. You can find yours here:<br><a href="https://huggingface.co/settings/tokens">https://huggingface.co/settings/tokens</a></p><pre><code class="language-Python"># Install Huggingface dependency and login using token
!pip install --upgrade huggingface_hub
from huggingface_hub import login
login()</code></pre><h2 id="loading-a-pretrained-model">Loading a Pretrained Model</h2><p>Now, let&apos;s load a pretrained model. For this example, we&apos;ll use a sentiment analysis model to classify text as positive or negative. We&apos;ll use the <code>pipeline</code> API, which makes it easy to use pretrained models for various tasks.</p><p>First, let&apos;s import the necessary libraries and set up our pipeline:</p><pre><code class="language-Python">from transformers import pipeline

# Load a sentiment analysis pipeline
classifier = pipeline(&apos;sentiment-analysis&apos;)</code></pre><p>The <code>pipeline</code> function abstracts away the complexities of loading models and tokenizers. By specifying <code>sentiment-analysis</code>, we&apos;re loading a pipeline tailored for this specific task.</p><h2 id="performing-sentiment-analysis">Performing Sentiment Analysis</h2><p>With our pipeline set up, we can now perform sentiment analysis on some sample text. Let&apos;s see how it works:</p><pre><code class="language-Python"># Define some text to analyze
text = &quot;I absolutely love the new features in the latest update! It&apos;s amazing!&quot;

# Use the classifier to analyze the sentiment
result = classifier(text)
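# Optional sketch, not in the original post: a transformers pipeline also
# accepts a list of strings and returns one prediction per input.

```python
batch = ["The service was great!", "I want a refund."]
# batch_results = classifier(batch)  # one {'label': ..., 'score': ...} dict per string
```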

# Print the result
print(result)</code></pre><p>When you run this code, you&apos;ll see the model&apos;s prediction. In this case, it will likely classify the text as positive. The output will look something like this:</p><pre><code class="language-Python">[{&apos;label&apos;: &apos;POSITIVE&apos;, &apos;score&apos;: 0.9998745918273926}]</code></pre><p>The <code>label</code> indicates the predicted sentiment, and the <code>score</code> represents the confidence level of the prediction.</p><h2 id="exploring-more-use-cases">Exploring More Use Cases</h2><p>The <code>transformers</code> library supports a wide range of tasks beyond sentiment analysis. Here are a few examples:</p><p><strong>Text Generation</strong>:</p><pre><code class="language-Python">generator = pipeline(&apos;text-generation&apos;, model=&apos;gpt2&apos;)
result = generator(&quot;Once upon a time&quot;, max_length=50, num_return_sequences=1)
print(result)</code></pre><p><strong>Question Answering</strong>:</p><pre><code class="language-Python">question_answerer = pipeline(&apos;question-answering&apos;)
context = &quot;Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very close to the Manhattan Bridge.&quot;
question = &quot;Where is Hugging Face based?&quot;
result = question_answerer(question=question, context=context)
print(result)</code></pre><p><strong>Translation</strong>:</p><pre><code class="language-Python">translator = pipeline(&apos;translation_en_to_fr&apos;)
result = translator(&quot;Hello, how are you?&quot;)
print(result)</code></pre><h2 id="conclusion">Conclusion</h2><p>And there you have it! We&apos;ve successfully used a pretrained model from Hugging Face to perform sentiment analysis in Jupyter Lab. Hugging Face&apos;s <code>transformers</code> library makes it incredibly easy to access and use powerful models for a variety of NLP tasks. Whether you&apos;re analyzing sentiment, generating text, answering questions, or translating languages, the possibilities are endless.</p><p>I hope this guide helps you get started with pretrained models from Hugging Face. Happy coding, and feel free to share your experiences and any cool projects you create using these amazing tools!</p><p>Happy coding!</p>]]></content:encoded></item><item><title><![CDATA[Breaking Down AI - Rainfall Prediction Using Machine Learning]]></title><description><![CDATA[<p>AI is all around us now, it&apos;s moving so fast that at times it feels as if I&apos;ll never keep up. I&apos;ll try though. One common trend I&apos;ve picked up on at work and in conversations with friends and family is that</p>]]></description><link>https://blog.good-spiders.com/rainfall-prediction-using-machine-learning/</link><guid isPermaLink="false">6659f226bc67790001e1447f</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 16:30:23 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/steve-johnson-ZPOoDQc8yMw-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/steve-johnson-ZPOoDQc8yMw-unsplash.jpg" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning"><p>AI is all around us now, it&apos;s moving so fast that at times it feels as if I&apos;ll never keep up. I&apos;ll try though. One common trend I&apos;ve picked up on at work and in conversations with friends and family is that AI has become such a broad and mystical term it&apos;s almost meaningless. 
It could mean anything, but in reality it&apos;s been around us for decades. What is really new are truly advanced chat assistants like ChatGPT, Gemini, and Claude. For my friends, family, and colleagues I wanted to start a series on breaking down AI into its constituent parts: the process of preparing data, selecting algorithms, training, evaluating, and many of the other steps that go into making these various AI services. So I&apos;m starting this series where I will take a look at a single use case involving data science with a singular goal in mind. These will be like micro projects which may or may not work as intended, but will hopefully shed some light on the different ways AI is implemented.</p><p>Trying to explain every facet of the different processes involved in data science will be out of scope for this post, but I will try to explain what I understand and what I can as I go along.</p><p>For each blog post in this series there will be a corresponding Jupyter Labs notebook available; the notebook and dataset used for this post are here:<br><a href="https://github.com/cskujawa/jupyter-notebooks/tree/main/ML/RainfallPredictions">https://github.com/cskujawa/jupyter-notebooks/tree/main/ML/RainfallPredictions</a></p><p>Before diving into it, I want to talk a little about how AI is involved in predicting weather. Organizations like the National Oceanic and Atmospheric Administration (NOAA) utilize sophisticated technologies and methodologies to make weather predictions, some of which include machine learning like we will use in this blog post.</p><blockquote><strong>Neural Networks and Regression Models:</strong> Enhance specific aspects of forecasting, like precipitation or storm intensity predictions. -<a href="https://www.noaa.gov/ai/about">https://www.noaa.gov/ai/about</a></blockquote><p>So, today I&#x2019;m excited to share a project where I predicted daily rainfall (albeit inaccurately) using historical data and machine learning. 
This journey takes us through data preparation, visualization, model selection, training, and evaluation.</p><h2 id="preparing-the-data">Preparing the Data</h2><p>First, I loaded the dataset and began the process of cleaning it up. This involved checking for missing values, outliers, and any inconsistencies that might skew the results.</p><pre><code class="language-Python"># Loading the Dataset
import pandas as pd

# Assuming the dataset is in a CSV file named &apos;rainfall_data.csv&apos;
df = pd.read_csv(&apos;rainfall_data.csv&apos;)

# Display the first few rows of the dataframe
df.head()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-15.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="882" height="301" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-15.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-15.png 882w" sizes="(min-width: 720px) 720px"></figure><h3 id="cleaning-the-data">Cleaning the Data</h3><p>I checked for missing values and filled them using forward fill to ensure there were no gaps that could affect the analysis.</p><pre><code class="language-Python"># Checking for missing values
df.isnull().sum()

# Fill or drop missing values if necessary
df = df.ffill()

# Alternatively, you can use backward fill
# df = df.bfill()</code></pre><h3 id="trimming-unnecessary-columns">Trimming Unnecessary Columns</h3><p>Next, I removed any columns that weren&#x2019;t relevant to the analysis, keeping only what was necessary.</p><pre><code class="language-Python"># Keeping only relevant columns
# Assuming &apos;DATE&apos; and &apos;PRCP&apos; are the relevant columns
df = df[[&apos;DATE&apos;, &apos;PRCP&apos;]]

# Renaming the &apos;PRCP&apos; column to &apos;Rainfall&apos;
df = df.rename(columns={&apos;PRCP&apos;: &apos;Rainfall&apos;})

# Display the first few rows of the cleaned dataframe
df.head()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-16.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="259" height="296"></figure><h3 id="exploring-the-data">Exploring the Data</h3><p>To understand the data better, I explored its structure and distribution.</p><pre><code class="language-Python"># Summary statistics
df.describe()

# Checking the data types
df.info()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-17.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="473" height="266"></figure><p>I had to do some additional cleaning to remove gaps in the timeline where there was no data, and I eventually ended up with this dataset.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-20.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="1645" height="891" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-20.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-20.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-20.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-20.png 1645w" sizes="(min-width: 720px) 720px"></figure><h3 id="engineering-features">Engineering Features</h3><p>I created new features from the existing data to make it more informative for the model.</p><pre><code class="language-Python"># Example: Extracting date-related features
df[&apos;DATE&apos;] = pd.to_datetime(df[&apos;DATE&apos;])
df[&apos;Year&apos;] = df[&apos;DATE&apos;].dt.year
df[&apos;Month&apos;] = df[&apos;DATE&apos;].dt.month
df[&apos;Day&apos;] = df[&apos;DATE&apos;].dt.day</code></pre><h3 id="splitting-the-data">Splitting the Data</h3><p>I split the dataset into training and testing sets to evaluate the model&#x2019;s performance accurately.</p><pre><code class="language-Python">from sklearn.model_selection import train_test_split

# Define features and target variable
X = df[[&apos;Year&apos;, &apos;Month&apos;, &apos;Day&apos;]]
y = df[&apos;Rainfall&apos;]

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)</code></pre><h2 id="choosing-and-training-models">Choosing and Training Models</h2><p>We will select a suitable machine learning model, train it on the training data, and evaluate its performance on the test data.</p><h3 id="linear-regression-model">Linear Regression Model</h3><blockquote><br>A linear regression model is a statistical method used to model the relationship between a dependent variable and one or more independent variables. It assumes a linear relationship between the input variables (independent) and the output variable (dependent). The model attempts to find the best-fitting straight line (the regression line) that minimizes the difference (error) between the observed data points and the predicted values. This line is represented by the equation &#x1D466;=&#x1D6FD;0+&#x1D6FD;1&#x1D465;+&#x1D716;<em>y</em>=<em>&#x3B2;</em>0&#x200B;+<em>&#x3B2;</em>1&#x200B;<em>x</em>+<em>&#x3F5;</em>, where &#x1D466;<em>y</em> is the dependent variable, &#x1D465;<em>x</em> is the independent variable, &#x1D6FD;0<em>&#x3B2;</em>0&#x200B; is the y-intercept, &#x1D6FD;1<em>&#x3B2;</em>1&#x200B; is the slope of the line, and &#x1D716;<em>&#x3F5;</em> is the error term. Linear regression is widely used for prediction and forecasting in various fields such as economics, biology, engineering, and social sciences.</blockquote><p>I started with a simple Linear Regression model to get a baseline performance.</p><pre><code class="language-Python">from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Initialize the model
lr_model = LinearRegression()

# Train the model
lr_model.fit(X_train, y_train)

# Predictions
y_pred_lr = lr_model.predict(X_test)

# Evaluation
mae_lr = mean_absolute_error(y_test, y_pred_lr)
mse_lr = mean_squared_error(y_test, y_pred_lr)</code></pre><h3 id="random-forest-model">Random Forest Model</h3><blockquote>A Random Forest model is an ensemble learning method used for classification and regression tasks. It constructs multiple decision trees during training and merges their results to produce a more accurate and stable prediction. Each tree in the forest is built using a random subset of the data and a random subset of the features, which helps to reduce overfitting and improve generalization. The final prediction is obtained by averaging the predictions of all the trees (for regression) or by majority voting (for classification). Random Forest models are known for their high accuracy, robustness to noise, and ability to handle large datasets with many features. They are widely used in various domains, including finance, healthcare, and image recognition.</blockquote><p>Next, I tried a Random Forest model, known for its ability to handle more complex data patterns.</p><pre><code class="language-Python">from sklearn.ensemble import RandomForestRegressor

# Initialize the model
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)

# Train the model
rf_model.fit(X_train, y_train)

# Predictions
y_pred_rf = rf_model.predict(X_test)

# Evaluation
mae_rf = mean_absolute_error(y_test, y_pred_rf)
mse_rf = mean_squared_error(y_test, y_pred_rf)</code></pre><h2 id="evaluating-the-models">Evaluating the Models</h2><p>I compared the performance of both models using mean absolute error (MAE) and mean squared error (MSE).</p><pre><code class="language-Python"># Model evaluation results
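# Optional sketch, not in the original notebook: RMSE is the square root of
# MSE and is expressed in the same units as the rainfall values, which makes
# the error magnitude easier to interpret than MSE itself.

```python
def rmse(mse):
    """Root mean squared error derived from a mean squared error value."""
    return mse ** 0.5
```

# For example, rmse(mse_rf) converts the Random Forest MSE of 0.0901 to about 0.30.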
evaluation_results = {
    &apos;Model&apos;: [&apos;Linear Regression&apos;, &apos;Random Forest&apos;],
    &apos;MAE&apos;: [mae_lr, mae_rf],
    &apos;MSE&apos;: [mse_lr, mse_rf]
}

# Display the evaluation results
pd.DataFrame(evaluation_results)</code></pre><p>We compared the performance of two machine learning models: Linear Regression and Random Forest. The evaluation results for Mean Squared Error (MSE) were:</p><ul><li><strong>Linear Regression</strong>: 0.1007</li><li><strong>Random Forest</strong>: 0.0901</li></ul><p>The lower MSE value for the Random Forest model indicates that it performs better in predicting daily rainfall compared to the Linear Regression model.</p><h2 id="visualizing-the-results">Visualizing the Results</h2><p>To see how well the models performed, I visualized the actual vs predicted rainfall values.</p><pre><code class="language-Python">import matplotlib.pyplot as plt

# Plot actual vs predicted values for Linear Regression
plt.figure(figsize=(10, 5))
plt.plot(y_test.values, label=&apos;Actual&apos;)
plt.plot(y_pred_lr, label=&apos;Predicted - Linear Regression&apos;)
plt.legend()
plt.title(&apos;Actual vs Predicted Rainfall (Linear Regression)&apos;)
plt.show()

# Plot actual vs predicted values for Random Forest
plt.figure(figsize=(10, 5))
plt.plot(y_test.values, label=&apos;Actual&apos;)
plt.plot(y_pred_rf, label=&apos;Predicted - Random Forest&apos;)
plt.legend()
plt.title(&apos;Actual vs Predicted Rainfall (Random Forest)&apos;)
plt.show()</code></pre><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-21.png" class="kg-image" alt="Breaking Down AI - Rainfall Prediction Using Machine Learning" loading="lazy" width="1661" height="891" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-21.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-21.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-21.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-21.png 1661w" sizes="(min-width: 720px) 720px"></figure><h2 id="reflections-and-future-directions">Reflections and Future Directions</h2><h3 id="model-performance">Model Performance</h3><p>The Random Forest model outperformed the Linear Regression model. It captured the trends and patterns in the data more effectively, as seen in the visualizations where predicted values closely followed the actual values.</p><h3 id="key-takeaways">Key Takeaways</h3><ol><li><strong>Model Selection</strong>: The Random Forest model is better suited for predicting daily rainfall in this dataset.</li><li><strong>Strengths</strong>: Its ability to handle non-linear relationships and interactions within the data.</li><li><strong>Areas for Improvement</strong>: Hyperparameter tuning, advanced models, and incorporating additional features like temperature and humidity.</li></ol><h3 id="future-work">Future Work</h3><p>To further improve the model:</p><ol><li><strong>Hyperparameter Tuning</strong>: Fine-tune the Random Forest parameters.</li><li><strong>Feature Engineering</strong>: Explore additional features such as weather conditions and seasonal indicators.</li><li><strong>Advanced Models</strong>: Experiment with LSTM or other time-series forecasting methods.</li><li><strong>Data Quality</strong>: Ensure a comprehensive and continuous dataset, possibly integrating more data sources.</li></ol><h3 
id="final-thoughts">Final Thoughts</h3><p>This project highlighted the importance of choosing the right model and evaluating it thoroughly. The Random Forest model proved to be a reliable tool for predicting daily rainfall, but there is a lot of room for improvement.</p><p>By continuously refining the model and incorporating new data and techniques, we can enhance its accuracy and utility in real-world applications.</p><p>Thanks for following along on this journey. Stay tuned for more explorations into the world of data science and machine learning!</p>]]></content:encoded></item><item><title><![CDATA[Setting Up and Using Jupyter Labs with Docker Compose]]></title><description><![CDATA[<h3 id="introduction">Introduction</h3><p><a href="https://jupyter.org/" rel="noreferrer">Jupyter Lab</a> is an essential tool for data scientists, providing an interactive environment for running notebooks, code, and data visualizations. Setting it up with Docker Compose ensures a consistent, isolated environment, leveraging GPU resources for enhanced performance.</p><p>Note that using the GPU version of Jupyter labs is not required,</p>]]></description><link>https://blog.good-spiders.com/running-jupyter-labs/</link><guid isPermaLink="false">665941f9db237400017b68bc</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 31 May 2024 15:40:03 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/Jupyter_logo.svg.png" medium="image"/><content:encoded><![CDATA[<h3 id="introduction">Introduction</h3><img src="https://blog.good-spiders.com/content/images/2024/05/Jupyter_logo.svg.png" alt="Setting Up and Using Jupyter Labs with Docker Compose"><p><a href="https://jupyter.org/" rel="noreferrer">Jupyter Lab</a> is an essential tool for data scientists, providing an interactive environment for running notebooks, code, and data visualizations. 
Setting it up with Docker Compose ensures a consistent, isolated environment, leveraging GPU resources for enhanced performance.</p><p>Note that using the GPU version of Jupyter Lab is not required, but I am using it for running ML programs, and processing on a CPU would be far slower.</p><h3 id="why-jupyter-labs">Why Jupyter Labs?</h3><p><strong>1. Rapid Iteration</strong>:<br>Jupyter Lab enables quick prototyping and experimentation. By providing an interactive coding environment, you can write and test code snippets, visualize data, and see results in real-time, making it ideal for data science workflows.</p><p><strong>2. Interactive Visualizations</strong>:<br>With built-in support for rich media outputs, Jupyter Lab is perfect for data visualization. It supports libraries like Matplotlib, Seaborn, and Plotly, allowing you to create and display charts, graphs, and other visualizations directly in your notebook.</p><p><strong>3. Ease of Use</strong>:<br>The intuitive interface of Jupyter Lab makes it accessible to both beginners and experienced users. It supports multiple programming languages via kernels, though Python is the most commonly used.</p><p><strong>4. Enhanced Collaboration</strong>:<br>Jupyter Lab&apos;s ability to share notebooks and work collaboratively with others is a significant advantage. Notebooks can be exported in various formats, including HTML and PDF, and shared with colleagues for review and collaboration.</p><p>You can find my Jupyter Notebooks here: <a href="https://github.com/cskujawa/jupyter-notebooks/tree/main">https://github.com/cskujawa/jupyter-notebooks/tree/main</a></p><p><strong>5. GPU Acceleration</strong>:<br>Leveraging GPUs in data science tasks can significantly speed up computations, especially for machine learning and deep learning applications. 
The Docker setup ensures that your Jupyter Lab environment is optimized for GPU usage, providing a performance boost for intensive tasks.</p><h3 id="docker-compose-definition">Docker Compose Definition</h3><p>Here&apos;s the Docker Compose configuration I used for setting up my instance of Jupyter Lab:</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-13.png" class="kg-image" alt="Setting Up and Using Jupyter Labs with Docker Compose" loading="lazy" width="935" height="754" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-13.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-13.png 935w" sizes="(min-width: 720px) 720px"></figure><h3 id="explanation">Explanation</h3><ol><li><strong>Container Configuration</strong>:<ul><li><strong>Image</strong>: Uses <code>cschranz/gpu-jupyter:v1.6_cuda-12.0_ubuntu-22.04</code>, which is tailored for GPU support with CUDA 12.0 and Ubuntu 22.04.</li></ul></li><li><strong>Environment Variables</strong>:<ul><li><strong>JUPYTER_ENABLE_LAB</strong>: Enables the Jupyter Lab interface.</li><li><strong>NVIDIA_DRIVER_CAPABILITIES</strong> and <strong>NVIDIA_VISIBLE_DEVICES</strong>: Required for GPU support.</li><li><strong>PASSWORD</strong>: Sets the password for Jupyter Lab access.</li></ul></li><li><strong>Volumes</strong>:<ul><li>Maps the host directory <code>./data/jupyter/</code> to the container&apos;s <code>/home/jovyan/</code> to persist notebooks and data.</li></ul></li><li><strong>Command</strong>:<ul><li>Runs <code>start-notebook.py</code> to initiate the Jupyter Lab server.</li></ul></li><li><strong>Devices</strong>:<ul><li>Maps <code>/dev/dri</code> to allow GPU access within the container.</li></ul></li><li><strong>Deployment Resources</strong>:<ul><li>Reserves GPU resources by specifying the NVIDIA driver, device IDs, and capabilities.</li></ul></li></ol><p>A guide for using Jupyter labs is out of scope for this blog 
post, but Jupyter has great documentation available here:<br><a href="https://jupyter-notebook.readthedocs.io/en/stable/">https://jupyter-notebook.readthedocs.io/en/stable/</a></p><p>I&apos;ve been working in my Jupyter Lab for a few months now and it has been a fantastic experience. I really wanted to set up the Jupyter Lab so I could test HuggingFace models out, try different projects, and not have to worry about building a whole app every time.</p><p>I&apos;ll be writing some future blog posts on the different projects I&apos;ve tackled using it.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-14-1.png" class="kg-image" alt="Setting Up and Using Jupyter Labs with Docker Compose" loading="lazy" width="2000" height="1029" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-14-1.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-14-1.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-14-1.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-14-1.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><h3 id="future-topics">Future Topics</h3><p>Stay tuned for future blog posts where we will use Jupyter Labs to delve into:</p><ul><li><strong>Generative LLMs</strong>: Exploring generative language models.</li><li><strong>Machine Learning</strong>: Notebooks on predicting using machine learning techniques.</li><li><strong>Sentiment Analysis</strong>: Projects on analyzing sentiment in text data.</li><li><strong>Text Summarization</strong>: Notebooks for summarizing text.</li><li><strong>Transcription</strong>: Projects related to transcribing audio data.</li></ul><h3 id="conclusion">Conclusion</h3><p>Setting up Jupyter Lab with Docker Compose streamlines the deployment process, especially when leveraging GPU resources. 
This setup ensures a consistent, isolated environment, enhancing reproducibility and performance for data science tasks. Whether you&apos;re prototyping machine learning models or visualizing complex datasets, Jupyter Lab offers a powerful and flexible platform to support your work. Happy coding!</p>]]></content:encoded></item><item><title><![CDATA[Home Server Mk. 2]]></title><description><![CDATA[<p>With everything I learned in the development of Mk. 1, I had a better idea of what Mk. 2 was going to be.</p><p>I knew it was all going to be built on the back of <a href="https://blog.good-spiders.comt/post/6634e246cf2b380001f21d6a" rel="noreferrer">Docker Compose</a>; after completely borking up my bare metal server dozens of times, I</p>]]></description><link>https://blog.good-spiders.com/home-server-mk-2/</link><guid isPermaLink="false">6634fea6cf2b380001f21e35</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 16:27:20 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/farai-gandiya-L6_BBAyWwnw-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/farai-gandiya-L6_BBAyWwnw-unsplash.jpg" alt="Home Server Mk. 2"><p>With everything I learned in the development of Mk. 1, I had a better idea of what Mk. 2 was going to be.</p><p>I knew it was all going to be built on the back of <a href="https://blog.good-spiders.comt/post/6634e246cf2b380001f21d6a" rel="noreferrer">Docker Compose</a>; after completely borking up my bare metal server dozens of times, I knew I wanted as few dependencies installed locally as physically possible. Almost every service I have running on Mk. 
2 came as a pre-packaged container, with a few API servers built from scratch on boilerplate containers.</p><p>The screenshot below is of my <a href="https://hub.docker.com/r/linuxserver/heimdall/" rel="noreferrer">Heimdall</a> container; it&apos;s like a web launcher for all of my other home server apps. Heimdall is just one of the many completely free, open-source containers <a href="https://www.linuxserver.io/" rel="noreferrer">LinuxServer.io</a> offers. I cannot praise them enough; their containers are phenomenal, well documented, and almost all plug-n-play. I&apos;m not affiliated with them, but I&apos;ve used a dozen of their containers and love them all. I&apos;m not going to discuss every container seen in the screenshot, but I will cover what is essentially the backbone of Mk. 2.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="1027" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><p>To get started, it&apos;s important to note that when I refer to a container, it is generally defined by a few lines of code. The screenshot below is a fairly average length definition for a container in Docker Compose. It has a name, an image (the pre-packaged container), some settings, and environment variables. The container below updates my CloudFlare DNS account when my IP changes.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-5.png" class="kg-image" alt="Home Server Mk. 
2" loading="lazy" width="1017" height="270" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-5.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-5.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-5.png 1017w" sizes="(min-width: 720px) 720px"></figure><h2 id="files-file-management-with-docker-compose"><strong>Files &amp; File Management with Docker Compose</strong></h2><p>I&apos;ve pared down my file structure to be as simple and quick to navigate as possible. If a container has configuration files, or if I want to manipulate files inside of a container, it gets its own little directory in the data directory.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-6.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="337" height="823"></figure><p>If I want to give a container access to system files, it gets a volume where the left side of the definition is the directory on the system and the right is the directory in the container. This effectively creates a live connection between the two. The definition below gives my Netdata container read-only (:ro) access to the system files it needs to monitor performance.</p><pre><code>volumes:
    - /proc:/host/proc:ro
    - /sys:/host/sys:ro
</code></pre>
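<p>For context, here&apos;s a minimal sketch of how that volumes block sits inside a full service definition (the image tag and restart policy here are illustrative, not my exact config):</p><pre><code>services:
  netdata:
    image: netdata/netdata  # pre-packaged container image
    restart: unless-stopped
    volumes:
      # host path (left) : container path (right) : read-only flag
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
</code></pre>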
<p>Linking files and directories in containers to the local system makes it really easy to see what&apos;s going on, and you can be as restrictive or permissive as needed. I usually use it for logs and configs, as seen in the definition below.</p><pre><code>volumes:
    - ./data/pt_wings/config:/etc/pterodactyl/
    - ./data/pt_wings/logs:/var/log/pterodactyl/
</code></pre>
<h2 id="services-i-use-nearly-every-day"><strong>Services I Use Nearly Every Day</strong></h2><p>Some of the services I have running on Mk. 2 are set-it-and-forget-it; others are tools I use nearly every day. Among those are <a href="https://hub.docker.com/r/portainer/portainer-ce" rel="noreferrer">Portainer</a>, <a href="https://hub.docker.com/r/amir20/dozzle" rel="noreferrer">Dozzle</a>, and <a href="https://hub.docker.com/r/netdata/netdata" rel="noreferrer">Netdata</a>. My Docker Compose definitions are almost identical to those found on their respective Docker Hub pages.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-7.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="1437" height="1725" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-7.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-7.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-7.png 1437w" sizes="(min-width: 720px) 720px"></figure><p><strong>Portainer</strong>:<br>I don&apos;t use Portainer to create containers even though it can do that; I still write my docker-compose.yaml files by hand and usually start containers initially via CLI. That being said, it is phenomenal for getting a high-level view of the different projects I have going on; each docker-compose.yaml represents a &quot;Stack&quot; in Portainer.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-8.png" class="kg-image" alt="Home Server Mk.
2" loading="lazy" width="2000" height="619" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-8.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-8.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-8.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-8.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p>When viewing a stack, it gives you an overview of each container in that stack. You can easily see each container&apos;s name, state, and ports at a glance, and you can start, stop, or restart them: most of the high-use commands in the Docker Compose package are covered.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-9.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="875" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-9.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-9.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-9.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-9.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p><strong>Dozzle:</strong><br>Dozzle is great for quickly viewing logs and memory/CPU usage per container. Portainer also has log-viewing capability, but it&apos;s less feature-rich than Dozzle. You can see Dozzle has a very similar stack/container layout in its navigation as well.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-10.png" class="kg-image" alt="Home Server Mk.
2" loading="lazy" width="2000" height="1024" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-10.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-10.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-10.png 1600w, https://blog.good-spiders.com/content/images/size/w2400/2024/05/image-10.png 2400w" sizes="(min-width: 1200px) 1200px"></figure><p><strong>Netdata:</strong><br>Now, if you read my post on Mk. 1, then you know I built a Laravel application leveraging CAdvisor, Prometheus, and Grafana to create a system monitoring page. That was months of development. Well, I replaced all of that with Netdata in about 5 minutes. It lets you view basically anything going on in the system; the free version has more information available than I&apos;ll ever use. It&apos;s great to see what is going on with the bare metal server itself.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.good-spiders.com/content/images/2024/05/image-3.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="2000" height="1026" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-3.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-3.png 1000w, https://blog.good-spiders.com/content/images/size/w1600/2024/05/image-3.png 1600w, https://blog.good-spiders.com/content/images/2024/05/image-3.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="other-important-services">Other Important Services</h2><p>There are a lot more services that go into running Mk. 2. I&apos;ll cover most of them briefly here.</p><p><strong>Docker Socket Proxy:</strong><br>This is a super simple container that replaces any lines where you would otherwise need to allow a container (like Portainer) to access the Docker Socket, which is used for starting/stopping/other Docker operations.
It acts as a security gateway/proxy for those requests.</p><p><strong>NGINX Proxy Manager:</strong><br>I could rave about this for days, but it will require its own article along with other advanced networking topics. For now, suffice it to say it makes managing reverse proxies, caching, SSL, and HTTPS routing incredibly simple.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-11.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="794" height="960" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-11.png 600w, https://blog.good-spiders.com/content/images/2024/05/image-11.png 794w" sizes="(min-width: 720px) 720px"></figure><p>These are some services I basically forget exist most of the time.</p><p><strong>DockerGC:</strong><br>This just frees up resources that are left dangling in Docker environments.</p><p><strong>WatchTower:</strong><br>This is great as it keeps your Docker containers up to date automatically, although sometimes you may not want that.</p><figure class="kg-card kg-image-card"><img src="https://blog.good-spiders.com/content/images/2024/05/image-12.png" class="kg-image" alt="Home Server Mk. 2" loading="lazy" width="1069" height="990" srcset="https://blog.good-spiders.com/content/images/size/w600/2024/05/image-12.png 600w, https://blog.good-spiders.com/content/images/size/w1000/2024/05/image-12.png 1000w, https://blog.good-spiders.com/content/images/2024/05/image-12.png 1069w" sizes="(min-width: 720px) 720px"></figure><h2 id="conclusion">Conclusion</h2><p>That about wraps it up. Those were some of the first services I got running on my home network, and they laid the groundwork for dozens of projects to follow. They make managing my projects super simple.</p>]]></content:encoded></item><item><title><![CDATA[Home Server Mk.
1]]></title><description><![CDATA[<p>Home server, home lab, tinker station, I don&apos;t know what to call the Frankenstein of a computer I&apos;ve built, but it&apos;s far and away my favorite thing to play with. This is the story of how it came to be.</p><p>It started with a</p>]]></description><link>https://blog.good-spiders.com/home-server/</link><guid isPermaLink="false">6634f038cf2b380001f21dbb</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 15:09:14 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/patrik-kernstock-8yN3T4XDJ70-unsplash.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/patrik-kernstock-8yN3T4XDJ70-unsplash.jpg" alt="Home Server Mk. 1"><p>Home server, home lab, tinker station, I don&apos;t know what to call the Frankenstein of a computer I&apos;ve built, but it&apos;s far and away my favorite thing to play with. This is the story of how it came to be.</p><p>It started with a computer case and motherboard my former employer was dumping; it was a workstation at one point in time. I bought a used server motherboard, Xeon processor, ECC RAM, and for kicks I stuck my old GeForce 1080 GPU in it. That sentence makes it sound like it happened overnight, but it took well over a year to get all of the hardware assembled into what it is today. This blog is hosted on that Frankenstein.</p><p>Now, the question remains: why? I got an A.S. in Computer Science and continued on to a B.S. in Information Technology. I thoroughly enjoyed most of the development I did, but I generally lack direction. I wanted to have a place where I could try and develop different applications, quickly iterate over designs, prototype, and most importantly scrap junk easily when needed.
So I decided I&apos;d scrap together my home server.</p><p>I started by building little bash scripts for a former employer; eventually I was developing whole pages and programs in PHP. That got old real fast; as it turns out, I have a strong distaste for front end development. <a href="https://laravel.com/" rel="noreferrer">Laravel</a> helped ease that pain, some. What it really did was introduce me to frameworks. My web development teacher always told me &quot;you&apos;re going to need to know how to write HTML from scratch,&quot; == false, it never hurt though.</p><p>The way that Laravel loads its parts was fascinating to me: you could edit a page, refresh the browser, and see the changes immediately; it was like magic. I wanted to build my own Laravel application. I had no idea what I was going to do with it, but I was going to do it. I heard another buzzword at that time and thought, I&apos;ll <em>containerize </em>it too!</p><p>So without a real plan I started by following this lovely guide from Digital Ocean:<br><a href="https://www.digitalocean.com/community/tutorials/how-to-install-and-set-up-laravel-with-docker-compose-on-ubuntu-22-04">https://www.digitalocean.com/community/tutorials/how-to-install-and-set-up-laravel-with-docker-compose-on-ubuntu-22-04</a></p><p>It was all going fairly well, learning how to prepare an application to be containerized and how to get the containers to communicate, but when it came time to make it available to external connections I was introduced to the boss fight: NGINX. Having never heard it spoken, only read it, I pronounced it en&#x2013;jinx. It was only a little embarrassing the first time a friend corrected me on that pronunciation. Learning NGINX was no joke; it was the better part of 6 months of debugging every time I needed to expose a new service. I learned all about reverse proxying, configuring HTTPS, caching, and so much more.
</p><p>Once I finally had a stable application, accessible from the web, it was time to do something with it. I didn&apos;t really have any idea where I was going to start, or what I was going to do. I was worried about resource utilization; this was my first dedicated server, after all, and I wanted a web interface to monitor it. So I set out to learn how to monitor a system. I learned all about <a href="https://github.com/google/cadvisor" rel="noreferrer">CAdvisor</a>, <a href="https://prometheus.io/" rel="noreferrer">Prometheus</a>, and <a href="https://grafana.com/" rel="noreferrer">Grafana</a>. By their powers combined I had put together a monitoring page. This would be the foundation for several projects that would be built on the back of the monolithic app that would be referred to as <a href="https://github.com/cskujawa/jarvis-ai/tree/main" rel="noreferrer">J.A.R.V.I.S</a>. Just A Really Versatile Information System.</p><figure class="kg-card kg-image-card"><img src="https://github.com/cskujawa/jarvis-ai/blob/main/interface/laravel/public/image/app.png?raw=true" class="kg-image" alt="Home Server Mk. 1" loading="lazy" width="1920" height="949"></figure><p>Now, in many of my positions I have functioned, or currently function, as a subject matter expert, and that means being able to explain to others what something is, how it works, etc. So I wrote <a href="https://github.com/cskujawa/jarvis-ai/tree/main/docs" rel="noreferrer">guides</a> for setting up that project using WSL and on a bare metal server (Ubuntu). I would not recommend using either of those guides or that project; it was my Home Server Mk. 1, full of flaws, bad logic, half-finished ideas, etc.</p><p>All of the skills I learned along the way prepared me for the adventure that would be my Home Server Mk.
2.</p>]]></content:encoded></item><item><title><![CDATA[Unleashing the Benefits of Docker Compose on Bare Metal Servers]]></title><description><![CDATA[<p>In the world of software deployment, Docker has revolutionized how applications are deployed and managed across different environments. Docker Compose, a tool for defining and running multi-container Docker applications, offers an added layer of convenience and efficiency, particularly when used on bare metal servers. This blog post explores the significant</p>]]></description><link>https://blog.good-spiders.com/unleashing-the-benefits-of-docker-compose-on-bare-metal-servers/</link><guid isPermaLink="false">6634e246cf2b380001f21d6a</guid><dc:creator><![CDATA[Cole Kujawa]]></dc:creator><pubDate>Fri, 03 May 2024 14:07:32 GMT</pubDate><media:content url="https://blog.good-spiders.com/content/images/2024/05/MainImage-2.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://blog.good-spiders.com/content/images/2024/05/MainImage-2.jpeg" alt="Unleashing the Benefits of Docker Compose on Bare Metal Servers"><p>In the world of software deployment, Docker has revolutionized how applications are deployed and managed across different environments. Docker Compose, a tool for defining and running multi-container Docker applications, offers an added layer of convenience and efficiency, particularly when used on bare metal servers. This blog post explores the significant benefits of using Docker Compose on a bare metal server, from improved resource utilization to simplified operational procedures.</p><p><strong>Enhanced Performance and Resource Utilization</strong>:<br>One of the standout benefits of using Docker Compose on a bare metal server is the direct access to hardware resources, which translates into enhanced performance. In my home lab I can add about 10 lines to any container definition to give the container access to my GPU. 
A few lines can define where and how a container stores data, static IPs, subnets, resource restrictions, etc.</p><p><strong>Simplified Management and Scalability</strong>:<br>Docker Compose simplifies the management of container-based applications. By defining your multi-container setup in a single YAML file, you can manage the entire lifecycle of your application stack with simple commands. This not only makes setup and teardown incredibly efficient but also ensures consistency across different development, testing, and production environments. I&apos;ve completely wiped and reinstalled the OS on my bare metal server and had my entire home lab back up and running in 10 minutes. It&apos;s as simple as cloning the project definition and any persistent data files and then starting the project.</p><p><strong>Consistency Across Environments</strong>:<br>Using Docker Compose on bare metal servers can significantly reduce the &quot;it works on my machine&quot; syndrome. The containers encapsulate all dependencies, ensuring that the application runs the same way, regardless of where it is deployed. This consistency is crucial for reducing bugs and errors that typically arise from environmental discrepancies during deployment.</p><p><strong>Isolation and Security</strong>:<br>Each container managed by Docker Compose runs in isolation, sharing only the kernel and essential resources. This isolation helps in minimizing conflict between running applications and enhances security by limiting the surface area for potential attacks. Regular updates and easy rollback features further bolster security and application stability.</p><p><strong>Some Great Docker Compose Features:</strong></p><p>Want your frontend to wait for the database to load before starting? It&apos;s just two lines:</p>
<pre><code>depends_on:
    - database
</code></pre>
<p>This can be expanded to include a health check operation as well:</p>
<pre><code>depends_on:
    database:
        condition: service_healthy
</code></pre>
<p>Most services have health checks pre-written somewhere on the internet that can be used easily. For instance, this is one I use for MariaDB, which I snagged from somewhere:</p>
<pre><code>healthcheck:
    interval: 30s
    retries: 3
    test:
        [
          &quot;CMD&quot;,
          &quot;healthcheck.sh&quot;,
          &quot;--su-mysql&quot;,
          &quot;--connect&quot;,
          &quot;--innodb_initialized&quot;
        ]
    timeout: 30s
</code></pre>
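<p>To see how the two pieces fit together, here&apos;s a minimal sketch (the frontend image name is a placeholder) where the frontend waits until the database reports healthy before starting:</p><pre><code>services:
  database:
    image: mariadb:latest
    healthcheck:
      test: [&quot;CMD&quot;, &quot;healthcheck.sh&quot;, &quot;--su-mysql&quot;, &quot;--connect&quot;, &quot;--innodb_initialized&quot;]
      interval: 30s
      timeout: 30s
      retries: 3
  frontend:
    image: my-frontend:latest  # placeholder image
    depends_on:
      database:
        condition: service_healthy
</code></pre>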
<p>Creating persistent volumes allows you to completely nuke a container and restart with all of its stored files intact:</p>
<pre><code>volumes:
  # MySQL Database
  db-data: {}
</code></pre>
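<p>A named volume only does something once it&apos;s attached to a service. Here&apos;s a minimal sketch; /var/lib/mysql is the database image&apos;s default data directory:</p><pre><code>services:
  database:
    image: mariadb:latest
    volumes:
      - db-data:/var/lib/mysql  # data survives container re-creation

volumes:
  # MySQL Database
  db-data: {}
</code></pre>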
<p>Proxying ports is super simple. These two port numbers represent the one inside the container (right) and the one that will be exposed on the bare metal server (left). This makes it possible to run any number of applications that share the same default port, and you only have to change one number in the docker-compose.yaml.</p>
<pre><code>ports:
    - 8081:8080 # Proxy container port 8080 to port 8081 on the host
</code></pre>
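<p>As an example (the service names here are invented), two services that both listen on port 8080 internally can coexist by mapping different host ports:</p><pre><code>services:
  app-one:
    ports:
      - 8081:8080  # host port 8081, container port 8080
  app-two:
    ports:
      - 8082:8080  # host port 8082, same container port
</code></pre>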
<p>Environment variables allow you to securely define protected information and access it in code by variable name:</p>
<pre><code>environment:
    MYSQL_DATABASE: ${DB_DATABASE}
    MYSQL_PASSWORD: ${DB_PASSWORD}
    MYSQL_USER: ${DB_USERNAME}
</code></pre>
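<p>Those ${...} references are resolved from a .env file sitting next to the docker-compose.yaml. A sketch with obviously fake values:</p><pre><code># .env (keep this file out of version control)
DB_DATABASE=app_db
DB_USERNAME=app_user
DB_PASSWORD=change-me
</code></pre><p>Docker Compose reads .env automatically, so the actual secrets stay out of the compose file itself.</p>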
<p><strong>Conclusion</strong>:<br>Embracing Docker Compose on bare metal servers brings forth a plethora of benefits that can significantly enhance application deployment, performance, and management. By reducing overhead, simplifying configurations, and ensuring consistency across environments, Docker Compose stands out as a vital tool for modern IT infrastructure. Whether you&#x2019;re managing complex applications or simple service stacks, Docker Compose paired with bare metal can be a game-changer.</p><p><strong>More Information:</strong></p>
<ul>
<li>Installing Docker (Ubuntu) - <a href="https://docs.docker.com/engine/install/ubuntu/">https://docs.docker.com/engine/install/ubuntu/</a></li>
<li>Installing Compose Plugin (Linux) - <a href="https://docs.docker.com/compose/install/linux/">https://docs.docker.com/compose/install/linux/</a></li>
<li>How Compose Works - <a href="https://docs.docker.com/compose/compose-application-model/">https://docs.docker.com/compose/compose-application-model/</a></li>
</ul>
]]></content:encoded></item></channel></rss>