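In the previous two articles, we built a complete three-tier application (MySQL, Redis, and a Golang backend service) and deployed it to GKE. Manual deployment does not hold up in practice, though, especially as the team grows or releases become frequent. This article walks through building a complete CI/CD pipeline that automatically tests, builds, deploys, and updates the application after each code change.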

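CI/CD Architecture Overview

We will build a modern CI/CD pipeline with the following stages:

  1. Source control: GitHub Repository
  2. Continuous integration (CI): GitHub Actions runs automated tests and builds
  3. Container image management: Google Container Registry (GCR) or Artifact Registry
  4. Continuous deployment (CD): automated deployment to separate environments (Development, Staging, Production)
  5. Monitoring and rollback: post-deployment health checks and an automated rollback mechanism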

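Prerequisites

Before starting, make sure the following resources are ready:

  1. GitHub Repository: holds the application code
  2. GCP service account: a service account with the required permissions
  3. Multiple GKE environments: Development, Staging, Production
  4. The complete code from the previous article: MySQL, Redis, and the Golang application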

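Step 1: Create Multi-Environment GKE Clusters

First, we need separate environments to support the CI/CD pipeline.

Create the environment clusters

# Development environment (smaller footprint; Autopilot sizes nodes from the Pod resource requests)
gcloud container clusters create-auto dev-cluster \
  --region=us-central1 \
  --cluster-version=latest

# Staging environment (mirrors production)
gcloud container clusters create-auto staging-cluster \
  --region=us-central1 \
  --cluster-version=latest

# Production environment (high availability; regional clusters span multiple zones)
gcloud container clusters create-auto prod-cluster \
  --region=us-central1 \
  --cluster-version=latest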

Configure kubectl contexts

# Fetch credentials for each environment
gcloud container clusters get-credentials dev-cluster --region us-central1
gcloud container clusters get-credentials staging-cluster --region us-central1
gcloud container clusters get-credentials prod-cluster --region us-central1

# Rename the contexts so they are easy to tell apart (replace PROJECT_ID with your project ID)
kubectl config rename-context gke_PROJECT_ID_us-central1_dev-cluster dev
kubectl config rename-context gke_PROJECT_ID_us-central1_staging-cluster staging
kubectl config rename-context gke_PROJECT_ID_us-central1_prod-cluster prod
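
The context names above follow the pattern gke_<project>_<location>_<cluster>. As a minimal sketch, the rename commands can be generated instead of typed by hand. The helper name gke_context_name and the project ID my-project are placeholders for illustration, not part of the original setup; the loop only prints the commands.

```shell
# Build the kubeconfig context name that get-credentials creates:
# gke_<project>_<location>_<cluster>
gke_context_name() (
  project=$1 location=$2 cluster=$3
  printf 'gke_%s_%s_%s\n' "$project" "$location" "$cluster"
)

# Print (do not run) the rename command for each environment.
# "my-project" is a placeholder project ID.
for name in dev staging prod; do
  echo "kubectl config rename-context $(gke_context_name my-project us-central1 "${name}-cluster") ${name}"
done
```

Review the printed commands, then pipe them to sh once the project ID is correct.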

Step 2: Create the Service Account and Permissions

Create a service account with the permissions the CI/CD pipeline needs.

Create the service account

# Create the service account
gcloud iam service-accounts create cicd-service-account \
  --display-name="CI/CD Service Account"

# Grant the required roles
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:cicd-service-account@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/container.developer"

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:cicd-service-account@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.admin"

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:cicd-service-account@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/cloudbuild.builds.builder"

# Generate a service account key (keep cicd-key.json out of version control)
gcloud iam service-accounts keys create cicd-key.json \
  --iam-account=cicd-service-account@PROJECT_ID.iam.gserviceaccount.com
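
The three role bindings differ only in the role name, so they can be produced by a loop. The sketch below prints one command per role rather than running gcloud, which makes it easy to review the list and drop roles the pipeline turns out not to need. The function name print_role_bindings is an assumption made for this example.

```shell
# Print one add-iam-policy-binding command per role.
# Usage: print_role_bindings PROJECT SERVICE_ACCOUNT_NAME ROLE...
print_role_bindings() (
  project=$1; sa_name=$2; shift 2
  member="serviceAccount:${sa_name}@${project}.iam.gserviceaccount.com"
  for role in "$@"; do
    printf 'gcloud projects add-iam-policy-binding %s --member="%s" --role="%s"\n' \
      "$project" "$member" "$role"
  done
)

# The same three roles as above; PROJECT_ID is still a placeholder.
print_role_bindings PROJECT_ID cicd-service-account \
  roles/container.developer roles/storage.admin roles/cloudbuild.builds.builder
```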

Step 3: Set Up the GitHub Repository Structure

Organize the code so that it supports the CI/CD pipeline.

Project directory structure

myapp-k8s/
├── .github/
│   └── workflows/
│       ├── ci.yml
│       ├── cd.yml
│       └── rollback.yml
├── apps/
│   └── golang-app/
│       ├── main.go
│       ├── go.mod
│       ├── go.sum
│       ├── Dockerfile
│       └── tests/
│           └── main_test.go
├── helm/
│   ├── golang-app/
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   ├── values-dev.yaml
│   │   ├── values-staging.yaml
│   │   ├── values-prod.yaml
│   │   └── templates/
│   ├── mysql/
│   │   └── values.yaml
│   └── redis/
│       └── values.yaml
├── k8s/
│   ├── namespaces/
│   ├── secrets/
│   ├── configmaps/
│   └── monitoring/
└── scripts/
    ├── deploy.sh
    └── rollback.sh

Step 4: Create Environment-Specific Helm Values

Create a Helm values file for each environment.

Development environment (values-dev.yaml)

# helm/golang-app/values-dev.yaml
replicaCount: 1

image:
  repository: gcr.io/PROJECT_ID/golang-app
  tag: "dev-latest"

service:
  type: ClusterIP  # internal testing only

resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 100m
    memory: 128Mi

env:
  - name: MYSQL_HOST
    value: "mysql-dev"
  - name: MYSQL_USER
    value: "appuser"
  - name: MYSQL_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret-dev
        key: password
  - name: MYSQL_DATABASE
    value: "myappdb_dev"
  - name: REDIS_HOST
    value: "redis-dev-master"  # bitnami/redis names the master Service <release>-master; the release is redis-dev
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: redis-secret-dev
        key: password

ingress:
  enabled: false

Staging environment (values-staging.yaml)

# helm/golang-app/values-staging.yaml
replicaCount: 2

image:
  repository: gcr.io/PROJECT_ID/golang-app
  tag: "staging-latest"

service:
  type: LoadBalancer

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 200m
    memory: 256Mi

env:
  - name: MYSQL_HOST
    value: "mysql-staging"
  - name: MYSQL_USER
    value: "appuser"
  - name: MYSQL_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret-staging
        key: password
  - name: MYSQL_DATABASE
    value: "myappdb_staging"
  - name: REDIS_HOST
    value: "redis-staging-master"  # bitnami/redis names the master Service <release>-master; the release is redis-staging
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: redis-secret-staging
        key: password

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "gce"
  hosts:
    - host: staging-api.yourcompany.com
      paths:
        - path: /
          pathType: Prefix

Production environment (values-prod.yaml)

# helm/golang-app/values-prod.yaml
replicaCount: 3

image:
  repository: gcr.io/PROJECT_ID/golang-app
  tag: "prod-latest"

service:
  type: LoadBalancer

resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi

env:
  - name: MYSQL_HOST
    value: "mysql-prod"
  - name: MYSQL_USER
    value: "appuser"
  - name: MYSQL_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret-prod
        key: password
  - name: MYSQL_DATABASE
    value: "myappdb_prod"
  - name: REDIS_HOST
    value: "redis-prod-master"  # bitnami/redis names the master Service <release>-master; the release is redis-prod
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: redis-secret-prod
        key: password

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "prod-ip"
  hosts:
    - host: api.yourcompany.com
      paths:
        - path: /
          pathType: Prefix

# Additional production settings
podDisruptionBudget:
  enabled: true
  minAvailable: 2

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

Step 5: Create the GitHub Actions CI Workflow

Create the continuous integration workflow, which covers testing, building, and pushing the image.

CI workflow (.github/workflows/ci.yml)

# .github/workflows/ci.yml
name: CI Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

env:
  PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
  GAR_LOCATION: us-central1
  REPOSITORY: myapp-repo
  IMAGE: golang-app

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Set up Go
      uses: actions/setup-go@v4
      with:
        go-version: '1.21'

    - name: Cache Go modules
      uses: actions/cache@v3
      with:
        path: ~/go/pkg/mod
        key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
        restore-keys: |
          ${{ runner.os }}-go-

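    - name: Run tests
      run: |
        cd apps/golang-app
        # Download the pinned modules; CI should not rewrite go.mod or go.sum
        go mod download
        go test -v ./...

    - name: Run linting
      run: |
        cd apps/golang-app
        go vet ./...

    - name: Check code formatting
      run: |
        cd apps/golang-app
        # gofmt -l lists unformatted files; quote the output so several file names do not break `test`
        test -z "$(gofmt -l .)"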

  build:
    needs: test
    runs-on: ubuntu-latest
    
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
      image-digest: ${{ steps.build.outputs.digest }}
    
    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Configure Docker to use gcloud
      run: gcloud auth configure-docker gcr.io

    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v4
      with:
        images: gcr.io/${{ env.PROJECT_ID }}/${{ env.IMAGE }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=sha,prefix={{branch}}-,format=long
          type=raw,value=latest,enable={{is_default_branch}}

    - name: Build and push Docker image
      id: build
      uses: docker/build-push-action@v4
      with:
        context: ./apps/golang-app
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}

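  security-scan:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by upload-sarif

    steps:
    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Configure Docker to use gcloud
      run: gcloud auth configure-docker gcr.io

    - name: Run Trivy vulnerability scanner
      uses: aquasecurity/trivy-action@master
      with:
        # Scan the exact image the build job pushed, addressed by digest
        image-ref: gcr.io/${{ env.PROJECT_ID }}/${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }}
        format: 'sarif'
        output: 'trivy-results.sarif'

    - name: Upload Trivy scan results
      uses: github/codeql-action/upload-sarif@v2
      with:
        sarif_file: 'trivy-results.sarif'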
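
The CI workflow pushes images tagged <branch>-<sha>, and the CD workflow later asks Helm for a tag of the same shape, so the two strings have to match character for character. One detail is easy to miss: metadata-action's type=sha emits a 7-character short SHA unless format=long is set, while a full Git commit SHA is 40 characters. The sketch below only illustrates the difference; image_tag is a hypothetical helper and the SHA is a made-up value.

```shell
# Compose "<branch>-<sha>". Mode "short" keeps the first 7 characters
# of the SHA; any other mode keeps the full SHA.
image_tag() (
  branch=$1 sha=$2 mode=${3:-long}
  if [ "$mode" = "short" ]; then
    sha=$(printf '%s' "$sha" | cut -c1-7)
  fi
  printf '%s-%s\n' "$branch" "$sha"
)

# Made-up 40-character SHA for illustration.
sha=0123456789abcdef0123456789abcdef01234567
image_tag develop "$sha" short
image_tag develop "$sha" long
```

If the two lines printed here differ from what the registry holds, the deployment fails with an image pull error, so it is worth checking the pushed tags once with `gcloud container images list-tags`.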

Step 6: Create the GitHub Actions CD Workflow

Create the continuous deployment workflow, which supports deployment to multiple environments.

CD workflow (.github/workflows/cd.yml)

# .github/workflows/cd.yml
name: CD Pipeline

on:
  workflow_run:
    workflows: ["CI Pipeline"]
    types: [completed]
    branches: [main, develop]

env:
  PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
  GAR_LOCATION: us-central1

jobs:
  deploy-dev:
    if: github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' && github.event.workflow_run.head_branch == 'develop'
    runs-on: ubuntu-latest
    environment: development
    
    steps:
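    - name: Checkout code
      uses: actions/checkout@v3
      with:
        # workflow_run checks out the default branch unless a ref is given
        ref: ${{ github.event.workflow_run.head_sha }}

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Set up Cloud SDK
      uses: google-github-actions/setup-gcloud@v1
      with:
        # kubectl needs this plugin to authenticate against GKE
        install_components: gke-gcloud-auth-plugin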

    - name: Get GKE credentials
      run: |
        gcloud container clusters get-credentials dev-cluster --region us-central1

    - name: Set up Helm
      uses: azure/setup-helm@v3
      with:
        version: '3.12.0'

    - name: Add Helm repositories
      run: |
        helm repo add bitnami https://charts.bitnami.com/bitnami
        helm repo update

    - name: Deploy infrastructure to Dev
      run: |
        # Deploy MySQL
        helm upgrade --install mysql-dev bitnami/mysql \
          --namespace dev --create-namespace \
          -f helm/mysql/values.yaml \
          --set auth.database=myappdb_dev

        # Deploy Redis
        helm upgrade --install redis-dev bitnami/redis \
          --namespace dev \
          -f helm/redis/values.yaml

    - name: Deploy application to Dev
      run: |
        helm upgrade --install golang-app-dev helm/golang-app \
          --namespace dev \
          -f helm/golang-app/values-dev.yaml \
          --set image.tag=develop-${{ github.event.workflow_run.head_sha }}

    - name: Run health checks
      run: |
        kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=golang-app \
          --namespace dev --timeout=300s
        
        # Simple health check
        kubectl exec -n dev deployment/golang-app-dev -- \
          curl -f http://localhost:8080/health || exit 1

  deploy-staging:
    if: github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' && github.event.workflow_run.head_branch == 'main'
    runs-on: ubuntu-latest
    environment: staging
    
    steps:
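    - name: Checkout code
      uses: actions/checkout@v3
      with:
        # workflow_run checks out the default branch unless a ref is given
        ref: ${{ github.event.workflow_run.head_sha }}

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Set up Cloud SDK
      uses: google-github-actions/setup-gcloud@v1
      with:
        # kubectl needs this plugin to authenticate against GKE
        install_components: gke-gcloud-auth-plugin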

    - name: Get GKE credentials
      run: |
        gcloud container clusters get-credentials staging-cluster --region us-central1

    - name: Set up Helm
      uses: azure/setup-helm@v3
      with:
        version: '3.12.0'

    - name: Add Helm repositories
      run: |
        helm repo add bitnami https://charts.bitnami.com/bitnami
        helm repo update

    - name: Deploy infrastructure to Staging
      run: |
        # Deploy MySQL
        helm upgrade --install mysql-staging bitnami/mysql \
          --namespace staging --create-namespace \
          -f helm/mysql/values.yaml \
          --set auth.database=myappdb_staging

        # Deploy Redis
        helm upgrade --install redis-staging bitnami/redis \
          --namespace staging \
          -f helm/redis/values.yaml

    - name: Deploy application to Staging
      run: |
        helm upgrade --install golang-app-staging helm/golang-app \
          --namespace staging \
          -f helm/golang-app/values-staging.yaml \
          --set image.tag=main-${{ github.event.workflow_run.head_sha }}

    - name: Run integration tests
      run: |
        kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=golang-app \
          --namespace staging --timeout=300s
        
        # Get service URL
        STAGING_URL=$(kubectl get svc golang-app-staging -n staging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
        
        # Run integration tests (-f makes curl exit non-zero on HTTP errors)
        curl -f "http://$STAGING_URL/health"
        curl -f -X POST "http://$STAGING_URL/users" -H "Content-Type: application/json" -d '{"name": "Test User"}'
        curl -f "http://$STAGING_URL/users"

  deploy-prod:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    
    steps:
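    - name: Checkout code
      uses: actions/checkout@v3
      with:
        # workflow_run checks out the default branch unless a ref is given
        ref: ${{ github.event.workflow_run.head_sha }}

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

    - name: Set up Cloud SDK
      uses: google-github-actions/setup-gcloud@v1
      with:
        # kubectl needs this plugin to authenticate against GKE
        install_components: gke-gcloud-auth-plugin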

    - name: Get GKE credentials
      run: |
        gcloud container clusters get-credentials prod-cluster --region us-central1

    - name: Set up Helm
      uses: azure/setup-helm@v3
      with:
        version: '3.12.0'

    - name: Add Helm repositories
      run: |
        helm repo add bitnami https://charts.bitnami.com/bitnami
        helm repo update

    - name: Deploy infrastructure to Production
      run: |
        # Deploy MySQL with backup configuration
        helm upgrade --install mysql-prod bitnami/mysql \
          --namespace production --create-namespace \
          -f helm/mysql/values.yaml \
          --set auth.database=myappdb_prod \
          --set primary.persistence.size=50Gi \
          --set metrics.enabled=true

        # Deploy Redis with persistence
        helm upgrade --install redis-prod bitnami/redis \
          --namespace production \
          -f helm/redis/values.yaml \
          --set master.persistence.size=20Gi

    - name: Rolling deployment to Production
      run: |
        # helm upgrade rolls out the new version (a Deployment's default strategy is a rolling update, not blue-green)
        helm upgrade --install golang-app-prod helm/golang-app \
          --namespace production \
          -f helm/golang-app/values-prod.yaml \
          --set image.tag=main-${{ github.event.workflow_run.head_sha }} \
          --wait --timeout=600s

    - name: Run production health checks
      run: |
        kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=golang-app \
          --namespace production --timeout=300s
        
        # Comprehensive health checks
        PROD_URL=$(kubectl get svc golang-app-prod -n production -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
        
        for i in {1..10}; do
          curl -f http://$PROD_URL/health || exit 1
          sleep 5
        done

    - name: Send deployment notification
      if: success()
      run: |
        echo "✅ Production deployment successful!"
        echo "Version: main-${{ github.event.workflow_run.head_sha }}"
        echo "Environment: Production"
        # A Slack or other notification call can be added here
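
The three deploy jobs repeat one pattern: pick a cluster, a namespace, a release name, and a values file, then run helm upgrade. The directory tree lists scripts/deploy.sh without showing it, so the following is only one possible sketch of such a script, not the original. It prints the commands instead of running them, and the function names resolve_env and deploy_command are assumptions made for this example.

```shell
#!/bin/sh
# scripts/deploy.sh (sketch): print the commands for one environment.

# Map an environment name to "cluster namespace release values-file".
resolve_env() (
  case $1 in
    dev)     echo "dev-cluster dev golang-app-dev values-dev.yaml" ;;
    staging) echo "staging-cluster staging golang-app-staging values-staging.yaml" ;;
    prod)    echo "prod-cluster production golang-app-prod values-prod.yaml" ;;
    *)       echo "Unknown environment: $1" >&2; exit 1 ;;
  esac
)

# Print the credential and Helm commands for ENVIRONMENT and IMAGE_TAG.
deploy_command() (
  target=$(resolve_env "$1") || exit 1
  image_tag=$2
  # Split the four space-separated fields into $1..$4.
  set -- $target
  echo "gcloud container clusters get-credentials $1 --region us-central1"
  echo "helm upgrade --install $3 helm/golang-app --namespace $2 -f helm/golang-app/$4 --set image.tag=$image_tag --wait --timeout=600s"
)

# Example: show what a dev deployment of tag develop-abc123 would run.
deploy_command dev develop-abc123
```

Keeping the environment mapping in one place means the workflow jobs and the rollback script cannot drift apart on names such as production versus prod.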

Step 7: Create the Automated Rollback Mechanism

Create a script and a workflow that can roll back quickly.

Rollback script (scripts/rollback.sh)

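#!/bin/bash
# scripts/rollback.sh
# Roll a Helm release back to a given revision in one environment.

# Stop on errors, on unset variables, and on failures inside pipelines
set -euo pipefail

# Default to empty so the usage check below runs even under `set -u`
ENVIRONMENT=${1:-}
REVISION=${2:-}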

if [ -z "$ENVIRONMENT" ] || [ -z "$REVISION" ]; then
    echo "Usage: $0 <environment> <revision>"
    echo "Example: $0 prod 2"
    exit 1
fi

case $ENVIRONMENT in
    "dev")
        CLUSTER="dev-cluster"
        NAMESPACE="dev"
        RELEASE="golang-app-dev"
        ;;
    "staging")
        CLUSTER="staging-cluster"
        NAMESPACE="staging"
        RELEASE="golang-app-staging"
        ;;
    "prod")
        CLUSTER="prod-cluster"
        NAMESPACE="production"
        RELEASE="golang-app-prod"
        ;;
    *)
        echo "Unknown environment: $ENVIRONMENT"
        exit 1
        ;;
esac

echo "Rolling back $RELEASE in $ENVIRONMENT to revision $REVISION..."

# Switch to the correct cluster
gcloud container clusters get-credentials "$CLUSTER" --region us-central1

# Perform rollback
helm rollback "$RELEASE" "$REVISION" --namespace "$NAMESPACE"

# Wait for rollback to complete
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=golang-app \
    --namespace "$NAMESPACE" --timeout=300s

echo "Rollback completed successfully!"

# Verify health
kubectl get pods -n "$NAMESPACE"
helm status "$RELEASE" -n "$NAMESPACE"

Emergency rollback workflow (.github/workflows/rollback.yml)

# .github/workflows/rollback.yml
name: Emergency Rollback

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options:
          - dev
          - staging
          - prod
      revision:
        description: 'Revision number to rollback to'
        required: true
        type: string

jobs:
  rollback:
    runs-on: ubuntu-latest
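    # Map the short input names to the GitHub Environments used by the CD workflow
    environment: ${{ github.event.inputs.environment == 'prod' && 'production' || github.event.inputs.environment == 'dev' && 'development' || 'staging' }}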
    
    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Authenticate to Google Cloud
      uses: google-github-actions/auth@v1
      with:
        credentials_json: ${{ secrets.GCP_SA_KEY }}

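    - name: Set up Cloud SDK
      uses: google-github-actions/setup-gcloud@v1
      with:
        # kubectl needs this plugin to authenticate against GKE
        install_components: gke-gcloud-auth-plugin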

    - name: Set up Helm
      uses: azure/setup-helm@v3
      with:
        version: '3.12.0'

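    - name: Execute rollback
      env:
        # Pass inputs through the environment so they are never interpolated into the script text
        TARGET_ENV: ${{ github.event.inputs.environment }}
        TARGET_REVISION: ${{ github.event.inputs.revision }}
      run: |
        chmod +x scripts/rollback.sh
        ./scripts/rollback.sh "$TARGET_ENV" "$TARGET_REVISION"

    - name: Verify rollback
      env:
        TARGET_ENV: ${{ github.event.inputs.environment }}
        TARGET_REVISION: ${{ github.event.inputs.revision }}
      run: |
        echo "Rollback completed for $TARGET_ENV environment"
        echo "Revision: $TARGET_REVISION"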
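
The revision input is free-form text typed into the workflow form, so it is worth rejecting anything that is not a plain positive integer before it reaches helm rollback. The sketch below shows one way to do that check in POSIX shell; is_valid_revision is a hypothetical helper that could sit near the top of scripts/rollback.sh.

```shell
# Succeed only when the argument is a positive integer without leading zeros.
is_valid_revision() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
    0*)          return 1 ;;  # zero, or a number written with leading zeros
    *)           return 0 ;;
  esac
}

# Try a valid revision, zero, and an input that attempts command injection.
for candidate in 2 0 '2; rm -rf /'; do
  if is_valid_revision "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```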

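Step 8: Configure GitHub Secrets

In the GitHub Repository settings, add the required Secrets:

  1. GCP_PROJECT_ID: your GCP project ID
  2. GCP_SA_KEY: the JSON content of the service account key

Step 9: Monitoring and Alerting

Set up monitoring and alerting. The manifests below use ServiceMonitor and PrometheusRule, which are Prometheus Operator resources, so the operator's CRDs must already be installed in the cluster.

Create monitoring metrics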

# k8s/monitoring/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: golang-app-monitor
  namespace: production
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: golang-app
  endpoints:
  - port: http
    path: /metrics
    interval: 30s

Alerting rules

# k8s/monitoring/alerting-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: golang-app-alerts
  namespace: production
spec:
  groups:
  - name: golang-app
    rules:
    - alert: HighErrorRate
      expr: |
        sum(rate(http_requests_total{status=~"5.."}[5m]))
          / sum(rate(http_requests_total[5m])) > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High error rate detected"
        description: "Error rate is above 10% for 5 minutes"

    - alert: PodRestartFrequent
      expr: |
        rate(kube_pod_container_status_restarts_total[15m]) > 0
      for: 0m
      labels:
        severity: warning
      annotations:
        summary: "Pod is restarting frequently"
        description: "Pod {{ $labels.pod }} is restarting frequently"
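
An error rate of 10% means the share of 5xx responses among all requests, which is a ratio of two per-second rates rather than the raw 5xx rate on its own. The arithmetic can be sketched with awk; error_ratio and exceeds are hypothetical helper names, and the request numbers are made up for illustration.

```shell
# error_ratio ERRORS TOTAL: print the 5xx share of all requests, 2 decimals.
error_ratio() {
  awk -v e="$1" -v t="$2" 'BEGIN { if (t == 0) print "0.00"; else printf "%.2f\n", e / t }'
}

# exceeds THRESHOLD RATIO: exit status 0 when RATIO is greater than THRESHOLD.
exceeds() {
  awk -v th="$1" -v r="$2" 'BEGIN { exit !(r > th) }'
}

# 3 errors per second out of 60 requests per second is a 5% error rate.
ratio=$(error_ratio 3 60)
if exceeds 0.1 "$ratio"; then
  echo "alert: ratio $ratio"
else
  echo "ok: ratio $ratio"
fi
```

With these numbers the raw 5xx rate (3 per second) is far above 0.1, yet the error ratio is only 0.05, which is why the threshold should be applied to the ratio.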

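Verifying the CI/CD Pipeline

Testing the full flow

  1. Make a change on the develop branch

    # Create and switch to the develop branch
    git checkout -b develop
    
    # Modify the application (for example, add a new API endpoint)
    # Commit the change
    git add .
    git commit -m "Add new API endpoint"
    git push origin develop
    
  2. Watch the CI pipeline

    • Tests, build, and security scan run automatically
    • The build is deployed to the Development environment automatically
  3. Open a Pull Request to main

    • This triggers the CI pipeline
    • Merge into main once the checks pass
  4. Watch the CD pipeline

    • The build is deployed to the Staging environment automatically
    • Integration tests run
    • Production deployment waits for manual approval, provided the production GitHub Environment has required reviewers configured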

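Monitoring deployment status

# Check deployment status in each environment (each one lives in its own cluster, so select the context)
kubectl get pods -n dev --context dev
kubectl get pods -n staging --context staging
kubectl get pods -n production --context prod

# Check the Helm release history
helm history golang-app-dev -n dev --kube-context dev
helm history golang-app-staging -n staging --kube-context staging
helm history golang-app-prod -n production --kube-context prod

# Check service status
kubectl get svc -n production --context prod
kubectl get ingress -n production --context prod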

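Best Practices

  1. Branching strategy

    • The main branch is used for Production deployments
    • The develop branch is used for the Development environment
    • Feature branches are used for feature development
  2. Security

    • Apply the least privilege principle to service account permissions
    • Rotate secrets regularly
    • Enable container image vulnerability scanning
  3. Testing strategy

    • Unit tests run in the CI stage
    • Integration tests run in the Staging environment
    • Smoke tests run before a production deployment
  4. Resource management

    • Set appropriate resource limits for each environment
    • Use HPA (Horizontal Pod Autoscaler) to handle load changes
    • Clean up unused resources regularly
  5. Monitoring and alerting

    • Set alerts on key metrics
    • Write runbooks for common problems
    • Review deployment metrics and performance regularly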

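Cleaning Up

When the resources are no longer needed, the following commands remove all of them:

# Remove the application from every environment (each release lives in its own cluster)
helm uninstall golang-app-dev -n dev --kube-context dev
helm uninstall golang-app-staging -n staging --kube-context staging
helm uninstall golang-app-prod -n production --kube-context prod

# Remove the infrastructure
helm uninstall mysql-dev -n dev --kube-context dev
helm uninstall redis-dev -n dev --kube-context dev
helm uninstall mysql-staging -n staging --kube-context staging
helm uninstall redis-staging -n staging --kube-context staging
helm uninstall mysql-prod -n production --kube-context prod
helm uninstall redis-prod -n production --kube-context prod

# Delete the clusters
gcloud container clusters delete dev-cluster --region us-central1
gcloud container clusters delete staging-cluster --region us-central1
gcloud container clusters delete prod-cluster --region us-central1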

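With this CI/CD pipeline in place, the path from code commit to production deployment is automated end to end. That improves development efficiency and keeps deployments consistent and reliable. In a real project, you can adapt the pipeline to your team's needs by adding more test stages, approval steps, or monitoring.

As a next step, consider adopting a GitOps tool such as ArgoCD to further improve the deployment flow, or adding more advanced features such as Canary deployments and A/B testing.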