關於網路那些事...

Marketing, SEO, Web trends, Programming tutorial, Web design, and Life event...

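Spring AI in Practice: Don't Stop at Chatbots. Engineering Design from ChatClient and Advisors to RAG and MCP

Many people, on first encountering Spring AI, think of it as "the Spring Boot version of the OpenAI SDK." That view is not entirely wrong, but applications that stop there tend to hit a wall quickly: prompts grow messier, external tools become harder to control as more are connected, and knowledge retrieval becomes entangled with conversation memory. The result is a demo that runs but can hardly be promoted into a production system.

A truly valuable Spring AI application does more than wire in an LLM; it places the LLM inside a Java system that can be verified, governed, and operated. That is where Spring AI earns its keep: beyond model calls, it is an application framework that connects chat models, retrieval pipelines, tool calling, context interception, and Spring Boot's engineering capabilities.

This article does not repeat the feature list in the official documentation. Instead, it answers a more important question from an engineering perspective: if you really intend to use Spring AI for an internal knowledge assistant, customer-service collaboration, a workflow agent, or an enterprise tool gateway, how should you design it so that the whole system has not turned into an unmaintainable prompt swamp three months later?

Let's start with an architecture that is closer to a real-world scenario: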

spring-ai-app/
 ├─ api layer
 │   └─ ChatController
 ├─ orchestration layer
 │   ├─ ChatClient
 │   ├─ Advisors
 │   └─ Prompt policies
 ├─ knowledge layer
 │   ├─ VectorStore
 │   ├─ QuestionAnswerAdvisor
 │   └─ RetrievalAugmentationAdvisor
 ├─ tool layer
 │   ├─ Internal business tools
 │   └─ MCP client / MCP server
 ├─ policy layer
 │   ├─ Security
 │   ├─ Auditing
 │   └─ Guardrails
 └─ ops layer
     ├─ Metrics / logs / traces
     ├─ Evaluation
     └─ Release workflow
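
To make the orchestration layer's "context interception" idea concrete, here is a minimal plain-Java sketch of an advisor chain: each advisor can inspect or rewrite the prompt before handing it to the next step, and the last step calls the model. It deliberately does not use the Spring AI API. ModelCall, PromptAdvisor, AdvisorChain, and the echo model are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a chat model: prompt in, answer out.
interface ModelCall extends Function<String, String> {}

// Hypothetical advisor: may inspect or rewrite the prompt, then delegates.
interface PromptAdvisor {
    String advise(String prompt, ModelCall next);
}

class AdvisorChain {
    private final List<PromptAdvisor> advisors = new ArrayList<>();
    private final ModelCall model;

    AdvisorChain(ModelCall model) {
        this.model = model;
    }

    // Advisors run in the order they are registered.
    AdvisorChain with(PromptAdvisor advisor) {
        advisors.add(advisor);
        return this;
    }

    String call(String prompt) {
        return callFrom(0, prompt);
    }

    // Walk the chain; once every advisor has run, invoke the model itself.
    private String callFrom(int index, String prompt) {
        if (index == advisors.size()) {
            return model.apply(prompt);
        }
        return advisors.get(index).advise(prompt, p -> callFrom(index + 1, p));
    }
}

class AdvisorChainDemo {
    static AdvisorChain build(List<String> audit) {
        // Fake model that only echoes what it received.
        ModelCall echoModel = prompt -> "MODEL(" + prompt + ")";
        return new AdvisorChain(echoModel)
            // Auditing advisor: records the user's original prompt.
            .with((prompt, next) -> { audit.add(prompt); return next.apply(prompt); })
            // Retrieval-style advisor: prepends (hard-coded) context to the prompt.
            .with((prompt, next) -> next.apply("[context: refund policy v2]\n" + prompt));
    }

    public static void main(String[] args) {
        List<String> audit = new ArrayList<>();
        System.out.println(build(audit).call("Can I get a refund?"));
        System.out.println("audited prompts: " + audit);
    }
}
```

The point of the shape is that auditing, retrieval, and guardrails stay separate, individually testable units instead of being folded into one ever-growing prompt string.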

Continue Reading

How to Engineer a Serious AI Delivery Framework with OpenSpec, Skills, Hooks, Rules, and Templates

Most AI agent frameworks look impressive in a demo and weak in a real delivery pipeline. They can generate code, draft plans, and even produce tests, but they often fail at the harder engineering problem: keeping implementation, intent, validation, and release evidence aligned over time.

That is the gap this article focuses on. Here, OpenSpec specifically means the Fission AI project at github.com/Fission-AI/OpenSpec, a spec-driven development framework for AI coding assistants. OpenSpec is not just “a place to write specs.” It is a structured workflow for managing change through artifacts, delta specs, lifecycle commands, and tool-aware integrations.

If you are trying to build an AI delivery framework that senior engineers can trust, OpenSpec can be a strong backbone. But it is only one layer. To make the whole system work, you still need skills, rules, hooks, templates, and a deliberate validation model.

Here is the architecture we actually care about:

governed-ai-delivery/
 ├─ openspec/
 │   ├─ specs/                      <-- current system behavior by domain
 │   └─ changes/                    <-- proposal, design, tasks, delta specs
 ├─ skills/                         <-- role-specific execution guidance
 ├─ rules/                          <-- persistent engineering constraints
 ├─ hooks/                          <-- deterministic lifecycle automation
 ├─ templates/                      <-- stable artifact shapes
 ├─ validations/
 │   ├─ test-strategy/
 │   ├─ policy-checks/
 │   └─ release-evidence/
 └─ ci/
     └─ pipelines/                  <-- reproducible validation and release gates

Continue Reading

A Pragmatic Guide to Spring Boot Architecture: Choosing Between Three-Layer and Clean Architecture

Software architecture is the foundational blueprint that determines the long-term success of an application. For developers working with Spring Boot, architectural choices have a direct impact on maintainability, scalability, and overall development productivity. A poorly chosen structure can result in a codebase that is hard to test, expensive to modify, and slow to evolve. Among the available patterns, Three-Layer Architecture and Clean Architecture are two of the most frequently used approaches. The former emphasizes simplicity and delivery speed, while the latter focuses on domain-centric design and long-term flexibility. This guide provides a clear, comparative analysis of both architectures, complete with practical examples, to help teams make informed decisions based on their project’s goals and constraints.

Continue Reading

Hexagonal Architecture (Ports & Adapters) explanation with Spring Boot examples:

Hexagonal Architecture

Hexagonal Architecture (HA), also known as Ports and Adapters Architecture, is a foundational and influential software design pattern, first introduced by Alistair Cockburn in 2005.

The term “Hexagonal Architecture” comes from a visual convention: the application component is drawn as a hexagon, not to imply it must have six boundaries or ports, but to leave enough space to represent the different interfaces connecting the component to the outside world. As Cockburn stated in his 2005 article:

The hexagon is not a hexagon because the number six is important, but rather to allow the people doing the drawing to have room to insert ports and adapters as they need, not being constrained by a one-dimensional layered drawing.

This insight is crucial—it reminds us to stop focusing on the shape and instead focus on the core concept: defining the application’s API via ports and connecting it to the outside world with interchangeable adapters. This shift from diagram to intent is the first step toward architectural maturity.

Ports & Adapters (Hexagonal Architecture) explicitly defines two layers: the inside (application core) and the outside (everything else), and requires clear definition of ports for interaction.

(Clean Architecture builds on this by further splitting the application core into more granular layers such as Use Cases, Entities, and Domain Services, but is less prescriptive about the “port” metaphor for the outer boundary.)
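
As a rough illustration of the inside/outside split, the sketch below defines one inbound port, one outbound port, a core service that depends only on those ports, and an in-memory adapter. All names (GreetUseCase, UserNamePort, GreetService, InMemoryUserNameAdapter) are hypothetical and chosen for this example only; a Spring Boot project would typically wire the same pieces together as beans.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Inbound port: the API the application core offers to the outside.
interface GreetUseCase {
    String greet(int userId);
}

// Outbound port: what the core needs from the outside world.
interface UserNamePort {
    Optional<String> findNameById(int userId);
}

// Application core: knows only the ports, never a concrete adapter.
class GreetService implements GreetUseCase {
    private final UserNamePort userNames;

    GreetService(UserNamePort userNames) {
        this.userNames = userNames;
    }

    @Override
    public String greet(int userId) {
        return userNames.findNameById(userId)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, guest!");
    }
}

// Outbound adapter: an in-memory implementation that could be swapped
// for a database-backed adapter without touching GreetService.
class InMemoryUserNameAdapter implements UserNamePort {
    private final Map<Integer, String> names = new HashMap<>();

    InMemoryUserNameAdapter put(int userId, String name) {
        names.put(userId, name);
        return this;
    }

    @Override
    public Optional<String> findNameById(int userId) {
        return Optional.ofNullable(names.get(userId));
    }
}

class HexagonalDemo {
    public static void main(String[] args) {
        GreetUseCase useCase =
                new GreetService(new InMemoryUserNameAdapter().put(1, "Ada"));
        System.out.println(useCase.greet(1));
        System.out.println(useCase.greet(2));
    }
}
```

Because the core sees only UserNamePort, tests can use the in-memory adapter while production uses a persistence adapter, which is the interchangeability the pattern is after.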

Continue Reading

Design Multiple Environment Configuration on Spring Boot

A Complete Guide to Multi-Environment Configuration in Spring Boot

In real-world development and deployment, a single project often needs to run in multiple environments (e.g., local, dev, staging, production).
Spring Boot provides a powerful mechanism for environment-based configuration, allowing us to automatically load different settings based on runtime environment variables.

This article walks through how to design a clean and maintainable multi-environment configuration structure step by step.
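
The layering idea behind profile-based configuration can be sketched in plain Java: load the base settings first, then let the active profile's settings override them. Spring Boot does this for you with application.properties and application-{profile}.properties, selected through spring.profiles.active (or the SPRING_PROFILES_ACTIVE environment variable). The class below is only a simplified stand-in, and the property keys and values in it are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ProfileConfigSketch {
    // File name Spring Boot would look for, given the active profile.
    static String profileFileName(String activeProfile) {
        return "application-" + activeProfile + ".properties";
    }

    // Load base settings first, then profile settings; later values win.
    static Properties merge(String baseContent, String profileContent) {
        Properties merged = new Properties();
        try {
            merged.load(new StringReader(baseContent));
            merged.load(new StringReader(profileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Fall back to "local" when no profile is set in the environment.
        String profile = System.getenv().getOrDefault("SPRING_PROFILES_ACTIVE", "local");

        // Hypothetical file contents, inlined so the sketch is self-contained.
        String base = "server.port=8080\napp.greeting=hello from base\n";
        String override = "app.greeting=hello from " + profile + "\n";

        Properties settings = merge(base, override);
        System.out.println(profileFileName(profile) + " -> "
                + settings.getProperty("app.greeting")
                + " on port " + settings.getProperty("server.port"));
    }
}
```

Keys that the profile file does not mention (server.port here) keep their base value, which is why shared defaults belong in the base file and only the differences go into each environment's file.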

Continue Reading

A Guide to Multi-Module Projects in Spring Boot

A multi-module Maven project is a project structure that lets developers split a large codebase into multiple interrelated sub-modules, all managed by a single parent project.

This parent-child Maven structure aligns well with microservices architecture. For example, a parent project can contain gateway and auth-service sub-modules. After packaging, these two modules can be deployed and maintained separately.

It’s important to note that this parent-child relationship is for build management, not a functional hierarchy. The parent POM centralizes dependency management, but it doesn’t mean gateway necessarily depends on auth-service.

Programmatic Dependencies: If the gateway module needs to use classes from auth-service, you add a dependency in its pom.xml. In a typical microservices setup, however, gateway and auth-service communicate over HTTP APIs, with OAuth tokens carrying authorization, so no direct JAR dependency is needed.

A typical multi-module project structure in Maven looks like this:

workspace/
 ├─ pom.xml           <-- Parent POM, defines dependencyManagement, pluginManagement, versions
 ├─ gateway/          <-- Child module
 │   └─ pom.xml
 └─ auth-service/     <-- Child module
     └─ pom.xml

Continue Reading

Robust Spring Boot Deployments: Health Checks, Versioning, and Makefile Automation

In a microservices architecture, service health checks and a reliable Linux deployment process are key to operational stability. This article explains how to build a health check endpoint in Spring Boot, expose version information through Maven-generated BuildProperties, and use a Makefile to streamline the path from development to deployment. Finally, it covers running the JAR on Linux and managing it with Supervisor.
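
A health endpoint that reports the build version boils down to reading the version from build metadata and returning it next to a status. The sketch below shows that core step in plain Java, assuming the metadata follows the build.version key that Spring Boot's build-info.properties uses. HealthSketch and its JSON shape are hypothetical; in a real Spring Boot service you would inject BuildProperties into a controller instead of parsing text yourself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class HealthSketch {
    // Build a small JSON health payload from build metadata text.
    static String healthJson(String buildInfoContent) {
        Properties buildInfo = new Properties();
        try {
            buildInfo.load(new StringReader(buildInfoContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Fall back to "unknown" when the build did not record a version.
        String version = buildInfo.getProperty("build.version", "unknown");
        return "{\"status\":\"UP\",\"version\":\"" + version + "\"}";
    }

    public static void main(String[] args) {
        // Hypothetical metadata, inlined so the sketch is self-contained.
        String buildInfo = "build.name=demo-service\nbuild.version=1.4.2\n";
        System.out.println(healthJson(buildInfo));
    }
}
```

Returning the version with the status lets a deployment script confirm that the process it just restarted is actually running the artifact it intended to ship.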

Continue Reading

A Professional Guide to MyBatis: From Basics to Advanced Techniques

What is MyBatis?

MyBatis is a first-class persistence framework that provides an alternative to traditional Object-Relational Mapping (ORM) solutions like Hibernate or JPA. Often referred to as a “SQL Mapper,” MyBatis distinguishes itself by embracing SQL. Instead of abstracting SQL away, it puts developers in full control, mapping SQL statements to Java methods.

The core philosophy of MyBatis is to decouple SQL from application logic while allowing you to leverage the full power of your database. It achieves this through two kinds of mapping:

  • Result Mapping: maps the results of a SQL query to Java objects.
  • Parameter Mapping: maps Java objects and parameters to SQL statement placeholders.

This approach makes MyBatis an excellent choice for projects that require precise control over SQL, such as financial systems, high-performance transaction platforms, or applications with complex, performance-sensitive queries.

In a Spring Boot ecosystem, MyBatis can be integrated primarily in two ways:

  1. Mapper Interface + XML (Recommended): Offers maximum flexibility for writing and maintaining complex SQL.
  2. Mapper Interface + Annotations: Suitable for simple queries and smaller projects.
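
To give a feel for what parameter mapping and result mapping mean, here is a simplified plain-Java sketch. It is not MyBatis code and does not reflect how MyBatis is implemented internally. It only mimics the idea: named #{...} placeholders become positional JDBC placeholders with arguments in matching order, and a result row becomes a Java object. SqlMapperSketch, UserRow, and the sample SQL are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical result type for the sample query.
class UserRow {
    final long id;
    final String name;

    UserRow(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class SqlMapperSketch {
    // Matches named placeholders such as #{status}.
    private static final Pattern PLACEHOLDER = Pattern.compile("#\\{(\\w+)\\}");

    // Parameter mapping, step 1: named placeholders become JDBC '?' markers.
    static String toJdbcSql(String sql) {
        return PLACEHOLDER.matcher(sql).replaceAll("?");
    }

    // Parameter mapping, step 2: arguments in the order the placeholders appear.
    static List<Object> orderedArgs(String sql, Map<String, Object> params) {
        List<Object> args = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(sql);
        while (matcher.find()) {
            args.add(params.get(matcher.group(1)));
        }
        return args;
    }

    // Result mapping: one row (column name -> value) becomes a Java object.
    static UserRow mapRow(Map<String, Object> row) {
        return new UserRow(((Number) row.get("id")).longValue(), (String) row.get("name"));
    }

    public static void main(String[] args) {
        String sql = "SELECT id, name FROM users WHERE status = #{status} AND age > #{minAge}";
        Map<String, Object> params = new HashMap<>();
        params.put("minAge", 18);
        params.put("status", "ACTIVE");

        System.out.println(toJdbcSql(sql));
        System.out.println(orderedArgs(sql, params));

        Map<String, Object> row = new HashMap<>();
        row.put("id", 7);
        row.put("name", "Ada");
        UserRow user = mapRow(row);
        System.out.println(user.id + " " + user.name);
    }
}
```

With MyBatis itself, the SQL would live in a mapper XML file or an annotation on a mapper interface method, and the framework would perform both mappings for you.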

Continue Reading

Implementing an OAuth2 Authorization Server with Spring Boot and MyBatis

In modern application development, secure authentication and authorization are crucial components. This article demonstrates how to implement a Login Service using:

  • Spring Boot (REST API, MVC pattern)
  • Spring Security + Authorization Server (OAuth 2.1)
  • MyBatis (database persistence)
  • MySQL (user and token storage)

The system acts as:

  1. OAuth2 Authorization Server (issues JWT access tokens and refresh tokens).
  2. OAuth2 Client (to support Google login in the future).
  3. Resource Server validator (other services can validate tokens issued here).

Continue Reading

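Building a CI/CD Pipeline for GKE Applications: Automated Deployment and Updates

In the previous two articles, we built a complete three-tier application consisting of MySQL, Redis, and a Golang backend service, and deployed it to GKE. Manual deployment, however, is impractical in real development, especially as the team grows or releases become frequent. This article shows how to build a complete CI/CD pipeline that automatically tests, builds, deploys, and updates the application after each code change.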

Continue Reading

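Deploying MySQL, Redis, and a Golang Application to GKE with Helm

In the previous article, we used Helm to deploy Nginx to GKE. This time we go a step further and build a complete three-tier application: a MySQL database, a Redis cache, and a Golang backend. This setup is much closer to what a real production deployment requires.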

Continue Reading

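Deploying Nginx to GCP GKE with Helm

I recently joined a new project and needed to get familiar with GCP services, so as a first exercise I deployed an Nginx application to Google Kubernetes Engine (GKE). I used Helm to simplify and manage the deployment, and this post records the process.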

Continue Reading

Gemini CLI: GenAI Game Development with Context7 MCP and Task Master AI

In this article, we will explore how to use the Gemini CLI to develop a 2D English word game by integrating it with Context7 and Task Master AI. This setup allows for efficient task management and access to the latest documentation, enhancing the development process.

Find the game demo here: Word Cloud Shooter

Continue Reading

Meta-Prompting: AutoGen to develop alphabet game by augment

When developing software—especially as codebases grow to thousands or even tens of thousands of lines—developers often encounter hard-to-fix bugs, or fixing one issue may introduce new problems. Even simple projects can require significant time to implement all desired features. Many existing AI programming tools, such as Cursor, are limited by their context window, typically supporting only about 10K tokens, which is roughly 400 to 600 lines of standard Python code. This short context length restricts the AI’s understanding of the entire project, making it difficult to identify key dependencies and architectural information. As a result, these tools can only handle single files or small-scale projects.

Continue Reading

GCP: Install Google Cloud SDK on Mac and upload image to Cloud Storage

Introduction

This article provides a guide on how to install and configure the Google Cloud SDK (gcloud) on a Mac, including how to authenticate, create projects, set regions and zones, and manage Google Cloud Storage (GCS) buckets and objects using gcloud and gsutil commands.

Continue Reading

Building a 2048 Game with Roo Code: A Hands-on Vibe Coding Experience

In this post, I’ll share how I built a classic 2048 game from scratch by combining the Vibe Coding workflow with Roo Code’s AI-powered collaborative development features. Through real project steps, practical tips, and personal notes, you’ll get a glimpse of what modern AI-assisted development feels like — and how it points toward a future where developers and AI co-create software seamlessly.

Check out the live demo here: https://game-2048.hoohoo.top/

What is Vibe Coding?

Vibe Coding is an emerging development methodology that emphasizes visualization, real-time interaction, and AI collaboration. The goal is to lower the barrier to entry for coding, shorten development cycles, and improve code quality all at once.

Continue Reading

Testing Lambda Runtime in AWS CloudShell with SAM and Docker Networks

Recently, we received AWS notifications that certain Node.js 18.x runtimes for our Lambda functions are approaching EOL and need to be upgraded. As part of our standard process, we first validate the new runtime locally before performing the actual update.

We’ve encountered issues where macOS on Apple Silicon (M1/M2) cannot reliably run Node.js 20.x. To work around this, we perform our runtime tests directly in AWS CloudShell. Since CloudShell does not support host.docker.internal for Docker networking, we’ve documented a workaround here.

Continue Reading

Enhanced Use Case for single-node Node.js to Cluster Module

Improving single-node performance in Node.js is a common challenge developers face, as Node.js’s single-threaded nature can limit its ability to handle high concurrency or heavy load in some applications. To enhance performance, you can consider a variety of solutions and techniques. Below are some common methods:

Using Multi-process or Multi-threading Techniques

(1) Node.js Cluster Module: The cluster module in Node.js allows for multi-process applications, enabling each CPU core to run a separate Node.js process. This can significantly improve the application’s concurrency performance.

(2) Worker Threads: The built-in Worker Threads in Node.js help you take advantage of multi-core CPUs for parallel computation. You can use Worker Threads to handle long-running computational tasks without blocking the main thread, thereby enhancing concurrency handling.

(3) PM2 for Monitoring and Multi-process Management: Using PM2 to configure multi-process mode for starting and managing Node.js processes can improve the application’s stability and resilience. PM2 automatically handles process restarts, load balancing, and other settings, making it easier to manage Node.js applications in production.

In this article, we will focus on introducing the Cluster module.

Continue Reading

Efficient Data Management in Redis: Leveraging Hashes and Sets for One-to-Many Relationships

When designing a caching solution in Redis for a one-to-many relationship, such as a quiz with multiple member answers, it’s important to choose the right data structures to ensure efficient storage and retrieval. Redis offers several data structures that can be used to achieve this, including hashes and sets. Below, we discuss two common approaches: using hashes and using sets combined with hashes.
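
The two layouts can be sketched with an in-memory stand-in for Redis, which keeps the example runnable without a server. The methods hset, hgetAll, sadd, and smembers mirror the Redis commands of the same names. The key patterns (quiz:{quizId}:answers, quiz:{quizId}:members, quiz:{quizId}:member:{memberId}) and the class QuizCacheSketch are assumptions made for this illustration, not the article's actual key design.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class QuizCacheSketch {
    // In-memory stand-ins for Redis hashes and sets, keyed by Redis key.
    private final Map<String, Map<String, String>> hashes = new HashMap<>();
    private final Map<String, Set<String>> sets = new HashMap<>();

    // Mirrors HSET key field value.
    void hset(String key, String field, String value) {
        hashes.computeIfAbsent(key, k -> new HashMap<>()).put(field, value);
    }

    // Mirrors HGETALL key.
    Map<String, String> hgetAll(String key) {
        return hashes.getOrDefault(key, Collections.emptyMap());
    }

    // Mirrors SADD key member.
    void sadd(String key, String member) {
        sets.computeIfAbsent(key, k -> new TreeSet<>()).add(member);
    }

    // Mirrors SMEMBERS key.
    Set<String> smembers(String key) {
        return sets.getOrDefault(key, Collections.emptySet());
    }

    // Approach 1: one hash per quiz, field = member id, value = answer.
    void saveAnswerInHash(String quizId, String memberId, String answer) {
        hset("quiz:" + quizId + ":answers", memberId, answer);
    }

    // Approach 2: a set indexes the members; each member gets a detail hash.
    void saveAnswerWithIndex(String quizId, String memberId,
                             String answer, String submittedAt) {
        sadd("quiz:" + quizId + ":members", memberId);
        String detailKey = "quiz:" + quizId + ":member:" + memberId;
        hset(detailKey, "answer", answer);
        hset(detailKey, "submittedAt", submittedAt);
    }

    public static void main(String[] args) {
        QuizCacheSketch cache = new QuizCacheSketch();
        cache.saveAnswerInHash("42", "m1", "B");
        cache.saveAnswerInHash("42", "m2", "C");
        System.out.println(cache.hgetAll("quiz:42:answers"));

        cache.saveAnswerWithIndex("42", "m1", "B", "2024-01-01T10:00:00Z");
        System.out.println(cache.smembers("quiz:42:members"));
        System.out.println(cache.hgetAll("quiz:42:member:m1"));
    }
}
```

The single hash is the simpler option when each member stores one small value. The set-plus-hash layout costs more keys but leaves room for several fields per member and for cheap membership checks.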

Continue Reading

Introduction to Grafana K6: Efficient Load Testing Tool Use Cases

Introduction

Grafana K6 is a highly efficient load testing tool written in Go and scripted in JavaScript, designed to get the most load out of a single machine. According to the official documentation, a single K6 process can make use of all CPU cores and, under ideal conditions, simulate 30,000–40,000 virtual users (VUs). That is typically enough to generate 100,000–300,000 requests per second (RPS). At 60 seconds per minute, that range works out to 6–18 million requests per minute, so substantial load tests can run without additional hardware.

Continue Reading