<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[My Cybersecurity Projects]]></title><description><![CDATA[This is the blog for all my cybersecurity related projects]]></description><link>https://blog.wongyx.com</link><generator>RSS for Node</generator><lastBuildDate>Wed, 15 Apr 2026 23:05:17 GMT</lastBuildDate><atom:link href="https://blog.wongyx.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Docker & Kubernetes Security (Part 1):
     Building and Securing a Phishing URL Scanner Locally]]></title><description><![CDATA[In this series, I’m learning Docker and Kubernetes security by building a phishing URL scanner and applying security practices along the way. This is Part 1, where everything runs locally on Minikube,]]></description><link>https://blog.wongyx.com/docker-kubernetes-security-part-1-building-and-securing-a-phishing-url-scanner-locally</link><guid isPermaLink="true">https://blog.wongyx.com/docker-kubernetes-security-part-1-building-and-securing-a-phishing-url-scanner-locally</guid><dc:creator><![CDATA[Yong Xiang Wong]]></dc:creator><pubDate>Wed, 15 Apr 2026 15:47:58 GMT</pubDate><content:encoded><![CDATA[<p>In this series, I’m learning Docker and Kubernetes security by building a phishing URL scanner and applying security practices along the way. This is <strong>Part 1</strong>, where everything runs locally on Minikube, giving me a space to experiment with core security concepts before taking the project to AWS.</p>
<p>The phishing URL scanner is a Golang web application that accepts a URL, analyses it against a set of detection rules, and returns a verdict. It runs as a containerized workload inside a local Kubernetes cluster provisioned with Minikube, backed by a Postgres database for storing scan results and history.</p>
<p>The source code of the project is available on <a href="https://github.com/wongyx/phishing-url-scanner">GitHub</a>.</p>
<h2>Architecture Overview</h2>
<p><strong>Tech stack:</strong> Golang · Docker · Kubernetes · Minikube · Postgres</p>
<p><strong>Kubernetes architecture diagram</strong></p>
<img src="https://cdn.hashnode.com/uploads/covers/69acfd0686766ac3a60349d1/e5d1ae78-8848-4920-beab-e4512d74af32.png" alt="" style="display:block;margin:0 auto" />

<h2>The Phishing URL Scanner</h2>
<p>The scanner is a REST API built in Golang using the Gin web framework. When a scan request comes in, it flows through a Gin handler into the core scanning logic, which does three things in parallel. First, it submits the URL to the VirusTotal API and checks how many security vendors have flagged it. Second, it checks the URL against the Google Safe Browsing API, which maintains a constantly updated list of known phishing and malware sites. Third, it performs a domain age check by querying RDAP records, as most phishing campaigns use newly registered domains. The results of all three checks are aggregated into a single verdict and stored in a Postgres database via GORM.</p>
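<p>The fan-out of the three checks can be sketched with goroutines and a <code>sync.WaitGroup</code>. This is a minimal illustration, not the project's actual code: the check functions, <code>CheckResult</code> struct, and verdict rule here are hypothetical stand-ins for the real VirusTotal, Safe Browsing, and RDAP clients.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// CheckResult is a hypothetical aggregate of one detection check.
type CheckResult struct {
	Name      string
	Malicious bool
}

// runChecks executes all checks concurrently and collects their results.
func runChecks(url string, checks []func(string) CheckResult) []CheckResult {
	results := make([]CheckResult, len(checks))
	var wg sync.WaitGroup
	for i, check := range checks {
		wg.Add(1)
		go func(i int, check func(string) CheckResult) {
			defer wg.Done()
			results[i] = check(url) // each goroutine writes its own slot
		}(i, check)
	}
	wg.Wait()
	return results
}

// verdict flags the URL if any single check reports it as malicious.
func verdict(results []CheckResult) string {
	for _, r := range results {
		if r.Malicious {
			return "malicious"
		}
	}
	return "clean"
}

func main() {
	// Stub checks standing in for VirusTotal, Safe Browsing, and RDAP.
	checks := []func(string) CheckResult{
		func(u string) CheckResult { return CheckResult{"virustotal", false} },
		func(u string) CheckResult { return CheckResult{"safebrowsing", true} },
		func(u string) CheckResult { return CheckResult{"domain-age", false} },
	}
	fmt.Println(verdict(runChecks("http://example.com", checks)))
}
```

<p>In the real service the aggregation rule would likely weigh vendor counts and domain age rather than a simple any-flag OR, but the concurrency shape is the same.</p>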
<p>I chose Golang for the application as I am regularly reviewing Golang code in my current job, and I wanted to deepen my understanding of the language. Building a real service from scratch allowed me to better understand how Go applications are structured, how requests flow through a Gin handler into a service layer, and how GORM abstracts database interactions. This hands-on experience gave me a clearer idea of how the Golang applications that I review in my job actually work.</p>
<h2>Containerizing with Docker</h2>
<p>As I previously did not have much experience with Docker, I wanted to get a working Docker container first before diving into the security side of things. My first Dockerfile was relatively simple. I set up a multi-stage build that compiled the binary in the builder stage and ran it on an Alpine base image so that the final image stays small. I also used Docker Compose to spin up both the application and Postgres in a single command to speed up local development. At this point, I also set up a CI pipeline using GitHub Actions to build the Docker image and push it to Docker Hub.</p>
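<p>A multi-stage build of this kind might look like the sketch below. The Go version, module paths, and port are illustrative assumptions, not the project's actual values.</p>

```dockerfile
# Build stage: compile a static Go binary (paths are illustrative)
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /scanner .

# Runtime stage: only the compiled binary ships in the final image
FROM alpine:3.20
COPY --from=builder /scanner /scanner
EXPOSE 8080
ENTRYPOINT ["/scanner"]
```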
<h3>Securing the Container</h3>
<p>My next step was to harden my Dockerfile. I started by switching the base image from Alpine to a Distroless image, which contains only the application and its runtime dependencies, with no shell, no package manager, and no debugging tools. This significantly limits an attacker's capability if they somehow manage to gain code execution inside the container. Next, I stripped the debug information at compile time using the build flags <code>-ldflags="-s -w -buildid="</code>. Removing the symbol table and DWARF debug information makes it much harder for an attacker to reverse engineer the binary and identify exploitable paths. Finally, I configured the application process to run as a non-root user, so that if a container escape vulnerability exists, the escaped process runs as an unprivileged user on the host, limiting the blast radius of the escape.</p>
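<p>Putting those three hardening steps together gives roughly the following Dockerfile. The build paths are assumed; the distroless <code>:nonroot</code> tag is a real image variant that runs the process as an unprivileged user (uid 65532).</p>

```dockerfile
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# Strip the symbol table, DWARF debug info, and build ID at compile time
RUN CGO_ENABLED=0 go build -ldflags="-s -w -buildid=" -o /scanner .

# Distroless static image: no shell, no package manager, no debug tools.
# The :nonroot tag makes the container run as a non-root user by default.
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /scanner /scanner
USER nonroot
ENTRYPOINT ["/scanner"]
```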
<p>Apart from hardening the Dockerfile, the CI/CD pipeline is another place where security can be baked in. I configured Semgrep in my GitHub Actions to scan my Golang code on every pull request using the <code>p/golang</code> ruleset, along with Gitleaks for secret scanning. The merge is blocked if Semgrep reports any finding at ERROR severity or if a secret is detected. This helps me catch security issues in my source code before they make it into a built image. I also set up Trivy in the CI/CD pipeline to scan the built container image for known CVEs in OS packages and language dependencies. If Trivy finds any vulnerability rated HIGH or CRITICAL, the image is blocked from being pushed to Docker Hub.</p>
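<p>As a rough sketch, a GitHub Actions job wiring up these gates could look like this. The job and image names are assumptions, and the real pipeline is likely split into separate workflows.</p>

```yaml
# Illustrative security job: SAST, secret scanning, and image scanning
name: ci
on: [pull_request]
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Semgrep (fail on ERROR-severity findings)
        run: |
          pip install semgrep
          semgrep scan --config p/golang --severity ERROR --error
      - name: Gitleaks secret scan
        uses: gitleaks/gitleaks-action@v2
      - name: Build image
        run: docker build -t scanner:${{ github.sha }} .
      - name: Trivy scan (block HIGH/CRITICAL CVEs)
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: scanner:${{ github.sha }}
          severity: HIGH,CRITICAL
          exit-code: "1"
```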
<h2>Kubernetes</h2>
<p>I decided to use Minikube to run Kubernetes locally so that I could learn Kubernetes by experimenting with it. Initially, I chose to keep things simple and created a single namespace for both my application and the Postgres database, a decision that eventually caused me some trouble down the road. I configured Postgres to run as a StatefulSet, as it made sense for a database to have a persistent volume that is preserved should the pod ever restart. To manage traffic routing to my app, I used an nginx ingress controller, which routes incoming requests to the appropriate services based on the rules I define.</p>
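<p>An ingress rule of that sort can be sketched as follows; the service name, path, and port here are hypothetical.</p>

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: scanner-ingress
spec:
  ingressClassName: nginx      # handled by the nginx ingress controller
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: scanner-svc   # hypothetical service name
                port:
                  number: 8080
```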
<p>For my application, I configured a Horizontal Pod Autoscaler to watch CPU utilisation and automatically scale the number of pods based on the load. This improves the reliability of the application, as it reduces the risk of the application overloading and crashing under sustained load or a DoS attack. I played around with it by running a load generator against the application, and it was quite interesting to see the number of pods go up as load increased, then come back down after I stopped the load generator.</p>
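<p>A CPU-based HPA in the <code>autoscaling/v2</code> API looks roughly like this; the deployment name, replica bounds, and 70% target are illustrative values, not the project's actual settings.</p>

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: scanner-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: scanner            # hypothetical deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% average CPU
```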
<h3>Kubernetes Security</h3>
<p>Starting with pod security, I implemented a security context at both the pod level and the container level for my application. At the pod level, I configured <code>runAsNonRoot: true</code> and <code>runAsUser: 1001</code> to ensure that the container process never runs as root. I also added <code>seccompProfile: RuntimeDefault</code> to restrict the set of Linux system calls the container is allowed to make, reducing the kernel attack surface. At the container level, I set <code>allowPrivilegeEscalation: false</code> to prevent the process from gaining more privileges than it starts with, and <code>readOnlyRootFilesystem: true</code> to prevent an attacker from dropping files or modifying the container environment at runtime. I also dropped all Linux capabilities to strip any remaining privileges the container does not need.</p>
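<p>Combined in a pod spec, those settings look like the fragment below (container name and image are placeholders):</p>

```yaml
spec:
  securityContext:               # pod level
    runAsNonRoot: true
    runAsUser: 1001
    seccompProfile:
      type: RuntimeDefault       # restrict allowed syscalls
  containers:
    - name: scanner              # hypothetical container name
      image: wongyx/scanner:latest
      securityContext:           # container level
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]          # no Linux capabilities at all
```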
<p>Next, I enforced Pod Security Admission at the namespace level by applying the <code>restricted</code> enforcement level to the namespace. This is where my initial decision to have a single namespace came back to bite me. My Postgres database failed to start because I had not configured a security context for it, and I quickly realised that things were not as simple as adding a security context to Postgres to satisfy the enforcement. The official Postgres image requires root-level access to initialise the database and set file permissions, which the <code>restricted</code> policy explicitly blocks. To solve this, I created a separate namespace for Postgres that runs under <code>baseline</code> enforcement, so that my application can preserve a strong security posture while Postgres gets the permissions it actually needs.</p>
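<p>Pod Security Admission is driven by namespace labels, so the split ends up as two namespace definitions along these lines (the namespace names are assumptions):</p>

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: scanner                  # application namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
apiVersion: v1
kind: Namespace
metadata:
  name: database                 # Postgres namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline
```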
<p>For network security, I added a Network Policy to my application, only allowing ingress traffic from the nginx ingress controller, plus egress traffic to the Postgres pod on its database port and DNS lookups on port 53. Since my application does not need to communicate with the Kubernetes API at all, I created a dedicated service account for it with the default token automount disabled. These settings help to limit an attacker's ability to move laterally in case the pod gets compromised.</p>
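<p>A sketch of such a policy and service account is below. The pod labels and namespace names are hypothetical; the ports assume the Postgres default of 5432 plus UDP 53 for DNS.</p>

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: scanner-netpol
spec:
  podSelector:
    matchLabels:
      app: scanner                 # hypothetical pod label
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
  egress:
    - to:                          # Postgres in its own namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: database
      ports:
        - port: 5432
          protocol: TCP
    - ports:                       # DNS resolution
        - port: 53
          protocol: UDP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: scanner-sa
automountServiceAccountToken: false   # no API token mounted into the pod
```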
<h2>Conclusion</h2>
<p>Building and securing a phishing URL scanner from scratch has really helped me learn about Docker and Kubernetes from both operational and security perspectives. By experimenting with different configurations, I gained a level of understanding that wouldn’t have come from reading the documentation alone. Honestly, I was surprised by how much privilege containers have by default, and how much configuration is required to properly enforce the principle of least privilege. With Part 1 complete, I look forward to starting Part 2 of the project, where I will deploy the application to AWS EKS.</p>
]]></content:encoded></item><item><title><![CDATA[Breaking into Cloud Security: My Cloud Resume Challenge]]></title><description><![CDATA[As a fresh graduate who recently started my career as a cybersecurity engineer, I was given the opportunity to explore cloud infrastructure in my workplace, and I quickly became interested in cloud se]]></description><link>https://blog.wongyx.com/breaking-into-cloud-security-my-cloud-resume-challenge</link><guid isPermaLink="true">https://blog.wongyx.com/breaking-into-cloud-security-my-cloud-resume-challenge</guid><dc:creator><![CDATA[Yong Xiang Wong]]></dc:creator><pubDate>Sat, 14 Mar 2026 07:33:02 GMT</pubDate><content:encoded><![CDATA[<p>As a fresh graduate who recently started my career as a cybersecurity engineer, I was given the opportunity to explore cloud infrastructure in my workplace, and I quickly became interested in cloud security. As I previously had little experience with cloud technologies, I began searching for learning resources online.</p>
<p>During this search, I came across the <a href="https://cloudresumechallenge.dev/">Cloud Resume Challenge</a>, a project designed to teach cloud fundamentals by building a real-world application. It seemed like the perfect opportunity for me not only to learn about cloud technologies, but also to gain exposure to related concepts such as CI/CD pipelines. Without much hesitation, I decided to take on the challenge, focusing on the goal of learning cloud and CI/CD security.</p>
<p>You can find the completed project at my website: <a href="http://resume.wongyx.com">resume.wongyx.com</a></p>
<h3>Architecture Overview</h3>
<img src="https://cdn.hashnode.com/uploads/covers/69acfd0686766ac3a60349d1/a8c6de62-df14-4b9a-9f4b-4060c6092d54.png" alt="" style="display:block;margin:0 auto" />

<p>For the challenge, I chose AWS as the cloud provider as I have some prior experience working with it in my job. I decided to use Cloudflare as my DNS since it is also my domain registrar.</p>
<h3>CI/CD Flow</h3>
<img src="https://cdn.hashnode.com/uploads/covers/69acfd0686766ac3a60349d1/707eaafb-0697-46c0-be37-4b0c557e06ea.png" alt="" style="display:block;margin:0 auto" />

<p>As for CI/CD, I mainly used GitHub Actions to configure my pipeline, integrating tests along the way to secure my project (this is discussed in greater detail later in the blog).</p>
<h2>Step 1: Frontend</h2>
<p>Since the main point of my project is to learn about securing the cloud, I decided to cheat a little bit for the HTML and CSS of my website by relying on Claude to generate them for me. These files are stored in an AWS S3 bucket, which serves them as a static website through CloudFront acting as the CDN. An SSL/TLS certificate is provisioned in AWS Certificate Manager to enable HTTPS traffic for the website.</p>
<p>For the security of this part, I configured my S3 bucket to only allow traffic from CloudFront by making use of AWS Origin Access Control. I also enabled DNSSEC on Cloudflare to enhance the security of my DNS.</p>
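<p>With Origin Access Control, the bucket policy grants <code>s3:GetObject</code> only to the CloudFront service principal, scoped to one distribution. The bucket name, account ID, and distribution ID below are placeholders:</p>

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOACOnly",
      "Effect": "Allow",
      "Principal": { "Service": "cloudfront.amazonaws.com" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-resume-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EXAMPLEID"
        }
      }
    }
  ]
}
```

<p>Direct requests to the S3 website endpoint then fail, since only signed requests from that specific distribution match the condition.</p>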
<h2>Step 2: Backend</h2>
<p>The backend for my website is hosted on serverless infrastructure, using DynamoDB as the database for my visitor counter and AWS Lambda to execute my Python code. AWS API Gateway receives the API calls made by the JavaScript code embedded in my website whenever someone visits it.</p>
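<p>A minimal visitor-counter handler along these lines might look as follows. This is a sketch, not the project's actual code: the table key schema, attribute names, and origin constant are assumptions, and the table is passed in as a parameter for testability, whereas a real Lambda would typically build the boto3 table object at module scope.</p>

```python
import json

# Assumption: the site's origin, used for the CORS response header
ALLOWED_ORIGIN = "https://resume.wongyx.com"


def increment_visitor_count(table):
    """Atomically increment the counter item and return the new value.

    `table` is any object with a DynamoDB-Table-style update_item method,
    e.g. boto3.resource("dynamodb").Table("visitor-counter").
    """
    resp = table.update_item(
        Key={"id": "visitors"},
        UpdateExpression="ADD #c :one",          # atomic server-side add
        ExpressionAttributeNames={"#c": "count"},
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"]["count"])


def handler(event, context, table):
    """Lambda-style entry point returning an API Gateway proxy response."""
    count = increment_visitor_count(table)
    return {
        "statusCode": 200,
        "headers": {
            # CORS: only the resume site itself may read the response
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
            "Content-Type": "application/json",
        },
        "body": json.dumps({"count": count}),
    }
```

<p>Using an <code>ADD</code> update expression keeps the increment atomic in DynamoDB, so concurrent visits never lose counts to a read-modify-write race.</p>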
<p>As code testing is not one of my main focuses for this project, I decided to keep it simple and only wrote a smoke test using Playwright. The test checks that my visitor counter successfully loads on the website and that the counter updates on refresh.</p>
<p>To secure the backend of my project, I added a throttling policy to my API Gateway as a protection against DDoS attacks. I also configured a CORS policy in my Lambda code to only allow my domain as the allowed origin. I wanted to configure AWS WAF to further secure my website, but it costs $5 per month and I wanted to keep this project as low cost as possible, so I decided not to implement the WAF in the end.</p>
<h2>Step 3: Infrastructure as Code</h2>
<p>This is the part where things started to get more complicated for me as I have no prior experience with IaC. I wanted to learn an IaC tool that will be useful for me in my future career, and I ended up choosing Terraform as it is cloud agnostic and widely used in the industry.</p>
<p>As I wanted my project to mimic real world deployment practices, I created two separate AWS accounts for test and production, and provisioned the two environments using the same Terraform code. Since I had already deployed the infrastructure manually in the earlier steps of the challenge, I had a clear idea of the resources required. To speed up the process, I used Claude to generate an initial Terraform template for the infrastructure. I then reviewed and modified the generated configuration to ensure it accurately reflected my deployed setup. By studying and refining the generated template, I gradually built an understanding of how Terraform defines infrastructure declaratively and manages relationships between resources.</p>
<p>Initially, Terraform state for the project was stored locally. However, I realized that this approach would not be suitable once I integrated Terraform with a CI/CD pipeline. To address this, I configured the state to be stored remotely in an S3 bucket. In addition, DynamoDB is used for state locking to prevent multiple deployments from modifying the infrastructure simultaneously.</p>
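<p>The remote backend configuration for this setup is a small block like the one below; the bucket name, key, region, and lock-table name are placeholders for whatever the project actually uses.</p>

```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state-bucket"       # hypothetical state bucket
    key            = "cloud-resume/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"          # DynamoDB table for state locking
    encrypt        = true                       # encrypt state at rest
  }
}
```

<p>The DynamoDB table only needs a single string partition key named <code>LockID</code>; Terraform writes a lock item there before any apply and deletes it afterwards.</p>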
<p>From a security perspective, I ensured that the IAM roles created in AWS follow the principle of least privilege. Each service is granted only the minimum permissions required to interact with other services, reducing the potential attack surface of the infrastructure.</p>
<h2>Step 4: CI/CD Pipeline</h2>
<p>The CI/CD pipeline was relatively new to me as well, so I was quite excited when I finally reached this part. I first set up the pipeline to automate the deployment of my AWS Lambda function along with the infrastructure provisioned using Terraform. When pull requests are merged into the main branch of my GitHub repository, the changes are first deployed to the test environment, where a smoke test runs. A failure in the smoke test triggers an automatic rollback to the previous working version. If the smoke test passes, I can then manually trigger the deployment into production, which similarly has the smoke test and rollback implemented.</p>
<p>To allow my CI/CD pipeline to interact with AWS securely, I configured OpenID Connect (OIDC) authentication between GitHub Actions and AWS instead of using long-lived access keys. This eliminates the need to store permanent access keys in the repository or GitHub secrets, reducing the risk of credential leakage while aligning with modern cloud security best practices. I also ensured that the IAM role follows the principle of least privilege by defining fine-grained permissions in the IAM policy, granting access only to the specific AWS resources and actions required for deployment. In particular, I restricted IAM-related permissions to read-only access, preventing the pipeline from creating or modifying IAM roles or policies. This approach limits the potential impact of a compromised workflow while still allowing the CI/CD pipeline to function as intended.</p>
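<p>The OIDC handshake in GitHub Actions boils down to granting the job an ID token and exchanging it for short-lived credentials via an IAM role. A sketch, with a placeholder account ID and role name:</p>

```yaml
permissions:
  id-token: write       # let the job request an OIDC token from GitHub
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Assume AWS role via OIDC (no stored access keys)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy  # hypothetical role
          aws-region: us-east-1
```

<p>On the AWS side, the role's trust policy restricts which repository and branch may assume it, so even a leaked workflow file cannot mint credentials from elsewhere.</p>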
<h3>Securing the pipeline</h3>
<p>My next step was to implement security into the CI/CD pipeline. I configured a pre-commit hook that runs gitleaks whenever I make a git commit, to ensure that I do not accidentally push any secrets to my public repository. I also generated an RSA 3072 GPG key to sign my commits, and set my GitHub repository to only accept signed commits.</p>
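<p>With the pre-commit framework, wiring gitleaks into every commit takes a few lines of <code>.pre-commit-config.yaml</code>; the pinned version below is an arbitrary example, not necessarily what the project uses.</p>

```yaml
# .pre-commit-config.yaml — gitleaks runs before every local commit
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4        # illustrative pinned version
    hooks:
      - id: gitleaks
```

<p>After <code>pre-commit install</code>, any commit containing a detected secret is rejected locally before it can ever reach the remote.</p>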
<p>I configured the SAST tools to run in the pipeline whenever a pull request is created, preventing vulnerable code from being merged into the main branch. I used GitHub CodeQL to analyze my Python code for security vulnerabilities and tfsec to scan my Terraform configuration, failing the checks if any high or critical issues are detected. I also added a branch protection rule in GitHub to block the merge if either of the scans fails.</p>
<p>After learning about the <a href="https://www.cisa.gov/news-events/alerts/2025/09/23/widespread-supply-chain-compromise-impacting-npm-ecosystem">Shai-Hulud</a> supply chain attack in the npm ecosystem in September 2025, I also decided to add Software Composition Analysis (SCA) into my CI/CD pipeline. I used Syft to generate an SBOM in the CycloneDX format, which is then analyzed using Grype. Similar to the SAST checks, any findings with high or critical severity will also block the merge into the main branch.</p>
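<p>As CI steps, the SBOM generation and scan are two short commands; the file name and scan target here are assumptions about the layout.</p>

```yaml
# Illustrative SCA steps: build a CycloneDX SBOM, then gate on severity
- name: Generate SBOM with Syft
  run: syft dir:. -o cyclonedx-json=sbom.json
- name: Scan SBOM with Grype (block HIGH/CRITICAL)
  run: grype sbom:sbom.json --fail-on high
```

<p>Scanning the SBOM rather than the source tree directly also leaves an auditable inventory artifact that can be rescanned later as new CVEs are published.</p>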
<h2>Conclusion</h2>
<p>Overall, I found this project to be both challenging and rewarding to work through. It allowed me to explore several areas that I had previously only read about, including cloud infrastructure, CI/CD automation, and security practices in modern development workflows. Building the system end-to-end gave me a deeper appreciation of how security can be integrated throughout the development lifecycle rather than treated as a separate step. In particular, working with cloud infrastructure and implementing security checks in the pipeline gave me a glimpse into the world of cloud security and DevSecOps. It is an area that I find particularly interesting and hope to continue exploring as I grow in my career.</p>
]]></content:encoded></item></channel></rss>