GitOps: Deep Dive & Best Practices
Concise, clear, and validated revision notes on GitOps (Git, GitHub, GitLab), structured for beginners and practitioners.
Table of Contents
- Introduction
- Core Concepts
- GitOps Principles
- Git Fundamentals
- GitHub and GitHub Actions
- GitLab CI/CD
- Repository Structure
- Branching Strategies
- CI/CD Pipeline Design
- Infrastructure as Code
- Deployment Strategies
- Security Best Practices
- Monitoring and Observability
- GitOps Tools
- Best Practices
- Common Pitfalls
- Jargon Tables
Introduction
GitOps is a modern operational framework that leverages Git as the single source of truth for declarative infrastructure and application code. It extends DevOps practices by using Git repositories to manage infrastructure configuration and application deployment, enabling teams to deliver software faster, more reliably, and with greater auditability.
Directory Structure Best Practices
Use Folders, Not Branches:
- Avoid environment branches (dev, staging, prod)
- Use directories to organize environments
- Easier to see all variants simultaneously
- Simpler promotion between environments
k8s/
├── base/                      # Common configuration
│   ├── deployment.yaml
│   ├── service.yaml
│   └── kustomization.yaml
└── overlays/
    ├── dev/
    │   ├── kustomization.yaml
    │   └── patch-replicas.yaml
    ├── staging/
    │   ├── kustomization.yaml
    │   └── patch-replicas.yaml
    └── prod/
        ├── kustomization.yaml
        └── patch-replicas.yaml
WET vs DRY Configuration:
DRY (Don't Repeat Yourself): Use templates and generators
- Pros: Less repetition, easier updates
- Cons: Harder to review, requires processing
WET (Write Everything Twice): Explicit configuration files
- Pros: Easy to review, no processing needed
- Cons: More files, potential inconsistencies
Recommendation: Use WET for GitOps
- Changes are visible in pull requests
- No hidden logic or transformations
- Config Sync applies exactly what's in Git
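One way to keep the benefits of both approaches is to render a template once and commit the fully expanded output, so reviewers see explicit files. The sketch below illustrates that idea with Python's standard `string.Template`; the template text, environment names, and values are hypothetical and not taken from any particular tool.

```python
from string import Template

# Hypothetical template and per-environment values, for illustration only.
DEPLOYMENT_TEMPLATE = Template(
    "apiVersion: apps/v1\n"
    "kind: Deployment\n"
    "metadata:\n"
    "  name: my-app\n"
    "  namespace: $namespace\n"
    "spec:\n"
    "  replicas: $replicas\n"
)

ENVIRONMENTS = {
    "dev": {"namespace": "dev", "replicas": 1},
    "prod": {"namespace": "production", "replicas": 3},
}


def render_all(template, environments):
    """Return {environment: fully rendered manifest text}."""
    return {
        name: template.substitute(values)
        for name, values in environments.items()
    }


rendered = render_all(DEPLOYMENT_TEMPLATE, ENVIRONMENTS)
print(rendered["prod"])
```

Committing the rendered text (the WET form) means a pull request diff shows the exact manifest that will be applied, with no placeholders left to resolve.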
Branching Strategies
Trunk-Based Development
Recommended for GitOps: Single main branch with short-lived feature branches.
Principles:
- Main branch is always deployable
- Feature branches live < 2 days
- Small, incremental changes
- Continuous integration
- Feature flags for incomplete features
main ──────────────────────────────────────
        \        /        \        /
         ────────          ────────
        feature-1          feature-2
Workflow:
# 1. Create feature branch
git checkout -b feature/add-health-check
# 2. Make small changes
vim deployment.yaml
# 3. Commit frequently
git add deployment.yaml
git commit -m "feat: add liveness probe to deployment"
# 4. Push and create PR immediately
git push origin feature/add-health-check
# 5. Merge quickly (within hours)
# 6. After the merge, switch back to main and delete the local branch
git checkout main
git pull
git branch -d feature/add-health-check
Environment Promotion
Use Directories, Not Branches:
configs/
├── dev/
│   └── app-config.yaml
├── staging/
│   └── app-config.yaml
└── prod/
    └── app-config.yaml
Promotion Process:
# 1. Test in dev
git checkout main
cd configs
# make changes under dev/, test
# 2. Promote to staging
cp dev/app-config.yaml staging/app-config.yaml
# adjust environment-specific values
git add staging/
git commit -m "chore: promote dev config to staging"
git push
# 3. After validation, promote to prod
cp staging/app-config.yaml prod/app-config.yaml
# adjust environment-specific values
git add prod/
git commit -m "chore: promote staging config to prod"
git push
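The "adjust environment-specific values" step is where a plain copy can go wrong, because it overwrites values that should differ between environments. The following sketch shows one possible way to promote a config while keeping the target's own values; the key names and sample data are hypothetical.

```python
import copy

# Keys that stay specific to each environment (hypothetical names).
ENVIRONMENT_SPECIFIC_KEYS = {"replicas", "database_url"}


def promote(source, target, keep=ENVIRONMENT_SPECIFIC_KEYS):
    """Copy source config over target, keeping target's environment-specific keys."""
    promoted = copy.deepcopy(source)
    for key in keep:
        if key in target:
            promoted[key] = copy.deepcopy(target[key])
        else:
            promoted.pop(key, None)
    return promoted


staging = {"image_tag": "1.3.0", "log_level": "info", "replicas": 2,
           "database_url": "postgres://staging-db/app"}
prod = {"image_tag": "1.2.0", "log_level": "warn", "replicas": 5,
        "database_url": "postgres://prod-db/app"}

new_prod = promote(staging, prod)
print(new_prod)
```

Here the image tag and log level move forward from staging, while the replica count and database URL keep their production values.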
Release Strategies
Git Flow (Not Recommended for GitOps)
main    ──────────────────────────────────────
              /            /            /
release ──────────────────────────────────────
            /            /            /
develop ──────────────────────────────────────
          / \          / \          / \
feature ────  ────────    ────────    ──────
Why Not for GitOps:
- Multiple long-lived branches
- Complex merge strategies
- Cherry-picking required
- Doesn't match the declarative model
GitHub Flow (Recommended)
main ──────────────────────────────────────────────
        \      /      \      /      \      /
         ──────        ──────        ──────
       feature-1     feature-2     feature-3
Workflow:
- Branch from main
- Make changes
- Create PR
- Review and test
- Merge to main
- Delete branch
CI/CD Pipeline Design
Pipeline Stages
┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
│  Code   │───▶│  Build  │───▶│  Test   │───▶│ Deploy  │
└─────────┘    └─────────┘    └─────────┘    └─────────┘
1. Code Stage
Pre-commit Hooks:
#!/bin/bash
# .git/hooks/pre-commit (the shebang must be the first line of the file)
# Run linting
if ! npm run lint; then
  echo "Linting failed. Commit aborted."
  exit 1
fi
# Run tests
if ! npm test; then
  echo "Tests failed. Commit aborted."
  exit 1
fi
exit 0
Pre-push Hooks:
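#!/bin/bash
# .git/hooks/pre-push (the shebang must be the first line of the file)
# Prevent push to main.
# Git passes one line per pushed ref on stdin:
#   <local ref> <local sha> <remote ref> <remote sha>
while read -r local_ref local_sha remote_ref remote_sha; do
  if [ "$remote_ref" = "refs/heads/main" ]; then
    echo "Direct push to main is not allowed."
    exit 1
  fi
done
exit 0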
2. Build Stage
build:
  stage: build
  script:
    # Build application
    - docker build -t $IMAGE:$CI_COMMIT_SHA .
    # Scan for vulnerabilities; --exit-code 1 fails the job when findings exist
    - trivy image --exit-code 1 --severity HIGH,CRITICAL $IMAGE:$CI_COMMIT_SHA
    # Push to registry
    - docker push $IMAGE:$CI_COMMIT_SHA
    # Update image tag in GitOps repo (assumes gitops-repo was cloned earlier in the job)
    - cd gitops-repo
    - kustomize edit set image app=$IMAGE:$CI_COMMIT_SHA
    - git commit -am "Update image to $CI_COMMIT_SHA"
    - git push
3. Test Stage
Test Types:
test:
  parallel:
    matrix:
      - TEST_TYPE: unit
      - TEST_TYPE: integration
      - TEST_TYPE: e2e
  script:
    - npm run test:$TEST_TYPE
Test Pyramid:
      /\
     /  \          E2E Tests (Few)
    /____\
   /      \        Integration Tests (Some)
  /________\
 /          \      Unit Tests (Many)
/____________\
4. Deploy Stage
GitOps Deploy (Update manifest, agent applies):
deploy:
  stage: deploy
  script:
    - git clone https://gitlab.com/org/gitops-repo.git
    - cd gitops-repo
    - yq eval ".spec.template.spec.containers[0].image = \"$IMAGE:$TAG\"" -i deployment.yaml
    - git add deployment.yaml
    - git commit -m "Deploy $TAG to production"
    - git push
Pipeline Best Practices
1. Fail Fast
jobs:
  quick-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint
        run: npm run lint
      - name: Type check
        run: npm run typecheck
  expensive-tests:
    needs: quick-checks # Only run if quick checks pass
    runs-on: ubuntu-latest
    steps:
      # Each job starts on a fresh runner, so the code must be checked out again
      - uses: actions/checkout@v4
      - name: Integration tests
        run: npm run test:integration
2. Cache Dependencies
- name: Cache node modules
  uses: actions/cache@v3
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
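Cache keys of this kind typically pair the runner OS with a hash of the dependency lockfile, so the cache is reused until dependencies change. The sketch below mimics that behaviour in plain Python; the function names and the SHA-256 choice are illustrative assumptions, not the cache action's actual implementation.

```python
import hashlib


def cache_key(os_name, lockfile_bytes):
    """Build a cache key that changes whenever the lockfile content changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{os_name}-node-{digest}"


def restore_prefix(os_name):
    """Fallback prefix: matches any earlier cache for the same OS."""
    return f"{os_name}-node-"


old_key = cache_key("Linux", b'{"lockfileVersion": 3, "packages": {}}')
new_key = cache_key("Linux", b'{"lockfileVersion": 3, "packages": {"left-pad": {}}}')

print(old_key != new_key)
print(new_key.startswith(restore_prefix("Linux")))
```

An exact key match restores the cache as-is; when the lockfile changes, the prefix still lets the job start from the most recent older cache instead of an empty one.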
3. Parallel Execution
test:
  strategy:
    matrix:
      suite: [unit, integration, e2e]
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: npm run test:${{ matrix.suite }}
4. Conditional Execution
deploy-staging:
  if: github.ref == 'refs/heads/develop'
  runs-on: ubuntu-latest
  steps:
    - run: ./deploy.sh staging
deploy-prod:
  if: github.event_name == 'release'
  runs-on: ubuntu-latest
  steps:
    - run: ./deploy.sh production
5. Manual Approval Gates
deploy-production:
  runs-on: ubuntu-latest
  # The job pauses for approval only if the "production" environment
  # has required reviewers configured in the repository settings
  environment:
    name: production
    url: https://example.com
  steps:
    - run: ./deploy.sh
Infrastructure as Code
Terraform
Example Configuration:
# main.tf
terraform {
  required_version = ">= 1.0"
  backend "s3" {
    bucket = "terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
# Input variables referenced below (the default region is an assumption that mirrors the backend)
variable "region" {
  type    = string
  default = "us-east-1"
}
variable "environment" {
  type = string
}
provider "aws" {
  region = var.region
}
# VPC
module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
  azs        = ["us-east-1a", "us-east-1b", "us-east-1c"]
  tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}
# EKS Cluster
module "eks" {
  source          = "./modules/eks"
  cluster_name    = "my-cluster"
  cluster_version = "1.28"
  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.private_subnets
  node_groups = {
    general = {
      desired_capacity = 3
      max_capacity     = 10
      min_capacity     = 2
      instance_types   = ["t3.medium"]
      labels = {
        role = "general"
      }
    }
  }
}
# Outputs
output "cluster_endpoint" {
  value = module.eks.cluster_endpoint
}
output "cluster_name" {
  value = module.eks.cluster_name
}
Terraform Workflow:
# Initialize
terraform init
# Plan changes
terraform plan -out=tfplan
# Apply changes
terraform apply tfplan
# Destroy resources
terraform destroy
GitOps with Terraform:
# .github/workflows/terraform.yml
name: Terraform
on:
  push:
    branches: [ main ]
    paths:
      - 'terraform/**'
  pull_request:
    branches: [ main ]
    paths:
      - 'terraform/**'
jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: terraform
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0
      - name: Terraform Init
        run: terraform init
      - name: Terraform Format
        run: terraform fmt -check
      - name: Terraform Validate
        run: terraform validate
      # A failed plan stops the job here, so apply never runs after a broken plan
      - name: Terraform Plan
        run: terraform plan -no-color
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve
Kubernetes Manifests
Plain YAML:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        version: v1.0.0
    spec:
      containers:
        - name: app
          image: myapp:v1.0.0
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: production
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
Kustomize
Directory Structure:
k8s/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── configmap.yaml
│   └── kustomization.yaml
└── overlays/
    ├── dev/
    │   ├── kustomization.yaml
    │   ├── patch-replicas.yaml
    │   └── patch-resources.yaml
    ├── staging/
    │   ├── kustomization.yaml
    │   └── patch-replicas.yaml
    └── prod/
        ├── kustomization.yaml
        └── patch-replicas.yaml
Base Configuration:
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - configmap.yaml
commonLabels:
  app: my-app
  managedBy: kustomize
Dev Overlay:
# overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# "bases" is deprecated; the base is listed under resources instead
resources:
  - ../../base
namespace: dev
# Each patches entry is an object; file-based patches use "path"
patches:
  - path: patch-replicas.yaml
  - path: patch-resources.yaml
images:
  - name: myapp
    newTag: dev-latest
configMapGenerator:
  - name: app-config
    behavior: merge
    literals:
      - LOG_LEVEL=debug
      - ENVIRONMENT=development
# overlays/dev/patch-replicas.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
Build and Apply:
# Build kustomization
kustomize build overlays/dev
# Apply to cluster
kustomize build overlays/dev | kubectl apply -f -
# Or use kubectl directly
kubectl apply -k overlays/dev
Helm
Chart Structure:
my-app/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   └── _helpers.tpl
└── charts/
Chart.yaml:
apiVersion: v2
name: my-app
description: A Helm chart for my application
type: application
version: 1.0.0
appVersion: "1.0.0"
values.yaml:
replicaCount: 3
image:
  repository: myapp
  tag: "1.0.0"
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 80
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: app-tls
      hosts:
        - app.example.com
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
Template:
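# templates/deployment.yaml
# Helper names ("my-app.fullname" and so on) assume the conventions that
# `helm create` generates in _helpers.tpl; adjust them to match your chart.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-app.fullname" . }}
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "my-app.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "my-app.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          resources:
            {{- toYaml .Values.resources | nindent 12 }}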
Environment-Specific Values:
# values-dev.yaml
replicaCount: 1
image:
  tag: dev-latest
resources:
  requests:
    cpu: 50m
    memory: 64Mi
autoscaling:
  enabled: false
# values-prod.yaml
replicaCount: 5
image:
  tag: "1.0.0"
resources:
  requests:
    cpu: 200m
    memory: 256Mi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
Helm Commands:
# Install chart
helm install my-app ./my-app -f values-prod.yaml
# Upgrade
helm upgrade my-app ./my-app -f values-prod.yaml
# Rollback
helm rollback my-app 1
# Uninstall
helm uninstall my-app
# List releases
helm list
# Get values
helm get values my-app
Deployment Strategies
Rolling Update
Description: Gradually replace old pods with new ones.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max pods above desired count
      maxUnavailable: 0  # Max pods that can be unavailable
Pros:
- Zero downtime
- Gradual rollout
- Easy rollback
Cons:
- Both versions run simultaneously
- Slower than recreate
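To see what `maxSurge` and `maxUnavailable` imply for capacity, it can help to compute the bounds they place on a rollout. The sketch below is a simplified model, assuming the Kubernetes rule that percentage values round up for `maxSurge` and down for `maxUnavailable`; the function names are hypothetical.

```python
def resolve(value, replicas, round_up):
    """Resolve an absolute count or a percentage string such as "25%"."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


# Settings from the example above, applied to a 3-replica Deployment.
print(rollout_bounds(3, 1, 0))
# Percentage form on a 10-replica Deployment.
print(rollout_bounds(10, "25%", "25%"))
```

With `maxSurge: 1` and `maxUnavailable: 0`, a 3-replica Deployment never drops below 3 available pods and never exceeds 4 in total, which is why the rollout is safe but slower.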
Blue-Green Deployment
Description: Run two identical environments, switch traffic between them.
# Blue deployment (current)
# Pod templates are omitted for brevity; each template's labels must match its selector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: blue
---
# Green deployment (new)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: green
---
# Service points to active version
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue # Switch to 'green' to cutover
Pros:
- Instant rollback
- Zero downtime
- Full testing before cutover
Cons:
- Double resources required
- Database migrations complex
Canary Deployment
Description: Route small percentage of traffic to new version.
# Stable version (roughly 90% of traffic when one Service selects both Deployments)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-stable
spec:
  replicas: 9
---
# Canary version (roughly 10% of traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1
Using Istio:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - match:
        - headers:
            canary:
              exact: "true"
      route:
        - destination:
            host: my-app
            subset: canary
    - route:
        - destination:
            host: my-app
            subset: stable
          weight: 90
        - destination:
            host: my-app
            subset: canary
          weight: 10
Pros:
- Reduced risk
- Real user testing
- Gradual rollout
Cons:
- Complex setup
- Monitoring required
- Longer deployment time
Security Best Practices
1. Secrets Management
Never Commit Secrets to Git:
# .gitignore
.env
secrets.yaml
*.pem
*.key
credentials.json
Use External Secret Stores:
# Using External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: prod/db/password
Sealed Secrets (Bitnami):
# Encrypt secret
kubeseal --format yaml < secret.yaml > sealed-secret.yaml
# Commit sealed secret to Git
git add sealed-secret.yaml
git commit -m "Add database credentials"
# sealed-secret.yaml (safe to commit)
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
spec:
  encryptedData:
    password: AgBHW3N2c3RoaW5nZW5jcnlwdGVkCg==
2. RBAC (Role-Based Access Control)
# Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: production
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "deployments"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: production
subjects:
  - kind: User
    name: jane.doe@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
3. Pod Security
Pod Security Standards:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Security Context:
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
4. Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
5. Image Security
Image Scanning:
# .github/workflows/security.yml
# IMAGE is a placeholder environment variable holding the image repository
- name: Run Trivy scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.IMAGE }}:${{ github.sha }}
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
- name: Upload to GitHub Security
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: 'trivy-results.sarif'
Image Signing (Cosign):
# Sign image
cosign sign --key cosign.key $IMAGE:$TAG
# Verify signature
cosign verify --key cosign.pub $IMAGE:$TAG
6. Audit Logging
# Enable audit logging in Kubernetes
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  - level: RequestResponse
    resources:
      - group: "apps"
        resources: ["deployments", "statefulsets"]
Monitoring and Observability
Metrics
Prometheus:
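# ServiceMonitor is a Prometheus Operator custom resource, not a core v1 kind
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics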
Key Metrics:
- Application: Request rate, error rate, latency (RED)
- Infrastructure: CPU, memory, disk, network (USE)
- GitOps: Sync status, drift detection, reconciliation time
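As a rough illustration of the RED metrics (rate, errors, duration), the sketch below derives them from a list of request records. The record shape, the sample data, and the nearest-rank percentile method are assumptions made for this example; a metrics system such as Prometheus computes these from counters and histograms instead.

```python
def red_metrics(requests, window_seconds):
    """Compute rate, error ratio, and p95 duration from (status, seconds) pairs."""
    total = len(requests)
    if total == 0:
        raise ValueError("no requests in window")
    errors = sum(1 for status, _ in requests if status >= 500)
    durations = sorted(duration for _, duration in requests)
    # Nearest-rank p95: smallest sample with at least 95% of samples at or below it.
    rank = max(1, -(-95 * total // 100))
    return {
        "rate_per_second": total / window_seconds,
        "error_ratio": errors / total,
        "p95_seconds": durations[rank - 1],
    }


# Hypothetical sample: 20 requests observed over a 10-second window.
sample = [(200, 0.1)] * 18 + [(500, 0.3), (200, 0.9)]
metrics = red_metrics(sample, window_seconds=10)
print(metrics)
```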
Logging
Structured Logging:
{
  "timestamp": "2025-01-15T10:30:00Z",
  "level": "info",
  "message": "Deployment successful",
  "service": "my-app",
  "version": "v1.2.3",
  "environment": "production",
  "user": "jane.doe@example.com"
}
Log Aggregation:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Loki (Grafana)
- CloudWatch Logs (AWS)
Tracing
OpenTelemetry:
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      # Recent Collector releases dropped the dedicated jaeger exporter;
      # Jaeger accepts OTLP directly (gRPC on port 4317)
      otlp:
        endpoint: jaeger:4317
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
Alerting
# PrometheusRule
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alerts
spec:
  groups:
    - name: my-app
      rules:
        - alert: HighErrorRate
          expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "High error rate detected"
            description: "Error rate is {{ $value }} requests/second"
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod is crash looping"
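The `for:` clause means an alert does not fire on the first breach: it stays pending until the expression has held for the whole duration. The sketch below simulates that state machine over evenly spaced evaluations; it is a simplified model, and the function name and sample values are hypothetical.

```python
def alert_states(samples, threshold, for_steps):
    """Return the alert state after each evaluation: inactive, pending, or firing.

    for_steps is the `for` duration expressed in evaluation intervals.
    """
    states = []
    breaching = 0  # consecutive evaluations above the threshold
    for value in samples:
        breaching = breaching + 1 if value > threshold else 0
        if breaching == 0:
            states.append("inactive")
        elif breaching > for_steps:
            # The condition has now held for at least for_steps intervals.
            states.append("firing")
        else:
            states.append("pending")
    return states


# Error rate per evaluation, threshold 0.05, `for` equal to two intervals.
states = alert_states([0.01, 0.06, 0.07, 0.08, 0.01, 0.09], 0.05, for_steps=2)
print(states)
```

A brief spike that clears before the duration elapses resets the alert to inactive, which is what keeps short blips from paging anyone.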
GitOps Tools
ArgoCD
Installation:
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
Application Definition:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo.git
    targetRevision: HEAD
    path: apps/my-app/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
ArgoCD CLI:
# Login
argocd login argocd.example.com
# List applications
argocd app list
# Get application details
argocd app get my-app
# Sync application
argocd app sync my-app
# Rollback
argocd app rollback my-app 0
Flux
Installation:
flux bootstrap github \
  --owner=myorg \
  --repository=fleet-infra \
  --branch=main \
  --path=./clusters/production \
  --personal
GitRepository:
# v1 is the stable API since Flux 2.0; older clusters may still serve v1beta2
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/my-app
  ref:
    branch: main
Kustomization:
# v1 is the stable API since Flux 2.0; older clusters may still serve v1beta2
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 5m
  path: ./k8s/overlays/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: my-app
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: my-app
      namespace: production
Flux CLI:
# Check Flux components
flux check
# Get kustomizations
flux get kustomizations
# Reconcile
flux reconcile kustomization my-app
# Suspend/resume
flux suspend kustomization my-app
flux resume kustomization my-app
Jenkins X
Installation:
jx boot
Pipeline Configuration:
# jenkins-x.yml
buildPack: none
pipelineConfig:
  pipelines:
    release:
      pipeline:
        stages:
          - name: build
            steps:
              - sh: docker build -t $DOCKER_REGISTRY/$APP_NAME:$VERSION .
          - name: test
            steps:
              - sh: make test
          - name: deploy
            steps:
              - sh: jx step helm apply
Comparison
| Feature | ArgoCD | Flux | Jenkins X |
|---|---|---|---|
| UI | Rich web UI | Limited | Web UI |
| Multi-cluster | Native | Via Git repos | Native |
| Helm Support | Full | Full | Native |
| Kustomize Support | Full | Full | Via plugin |
| SSO | OIDC, LDAP | Not built in | OAuth |
| RBAC | Fine-grained | K8s RBAC | K8s RBAC |
| Notifications | Slack, Email | Slack, Email | Multiple |
| CI Integration | Any CI | Any CI | Built-in |
| Learning Curve | Medium | Low | High |
Best Practices
1. Git Repository Organization
Separate Concerns:
- Application code repository
- Infrastructure repository
- Configuration repository
Benefits:
- Different lifecycles
- Different teams
- Different security requirements
- Different approval processes
org/
├── app-user-service/     # Application code
├── infrastructure/       # Terraform, CloudFormation
└── gitops-configs/       # K8s manifests, Helm values
2. Environment Management
Use Directories, Not Branches:
configs/
├── base/                 # Common configuration
└── environments/
    ├── dev/
    ├── staging/
    └── prod/
Environment Promotion:
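# Promote staging to prod
# Compare the two directories (--no-index diffs paths directly; exits 1 when they differ)
git diff --no-index environments/staging environments/prod
# Copy the validated config, then adjust environment-specific values
cp environments/staging/app-config.yaml environments/prod/app-config.yaml
git add environments/prod/
git commit -m "Promote staging config to prod"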
3. Declarative Configuration
Always Use Declarative Syntax:
# Good - Declarative (abbreviated manifest; selector and template omitted)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
# Bad - Imperative
# kubectl scale deployment my-app --replicas=3
4. Version Everything
Tag Releases:
git tag -a v1.2.3 -m "Release version 1.2.3"
git push origin v1.2.3
Semantic Versioning:
- MAJOR.MINOR.PATCH (1.2.3)
- MAJOR: Breaking changes
- MINOR: New features (backward compatible)
- PATCH: Bug fixes
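The bump rules above are mechanical enough to express in a few lines. The following is a minimal sketch that handles only plain MAJOR.MINOR.PATCH strings (an optional leading "v" is accepted); pre-release and build metadata are deliberately out of scope, and the function name is hypothetical.

```python
def bump(version, change):
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    if version.startswith("v"):
        version = version[1:]
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.2.3", "minor"))
print(bump("v1.2.3", "major"))
```

Note that a bump resets every lower-order component to zero, so a breaking change after 1.2.3 yields 2.0.0 rather than 2.2.3.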
5. Automated Testing
Test Infrastructure Code:
# .github/workflows/terraform-test.yml
name: Terraform Test
on:
  pull_request:
    paths:
      - 'terraform/**'
jobs:
  test:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: terraform
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
      - name: Terraform Format
        run: terraform fmt -check -recursive
      - name: Terraform Validate
        run: |
          terraform init -backend=false
          terraform validate
      - name: TFLint
        uses: terraform-linters/setup-tflint@v3
      - name: Run TFLint
        run: tflint --recursive
      - name: Checkov Security Scan
        uses: bridgecrewio/checkov-action@master
        with:
          directory: terraform/
Test Kubernetes Manifests:
# .github/workflows/k8s-test.yml
name: Kubernetes Manifest Test
on:
  pull_request:
    paths:
      - 'k8s/**'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup tools
        run: |
          curl -s https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh | bash
          sudo snap install kubeconform
      - name: Validate with kustomize
        run: |
          kustomize build k8s/overlays/prod > output.yaml
      - name: Validate with kubeconform
        run: |
          kubeconform -summary -output json output.yaml
      - name: Setup OPA
        uses: open-policy-agent/setup-opa@v2
      - name: Policy tests with OPA
        run: opa test policies/ -v
6. Security Practices
Scan for Secrets:
- name: Gitleaks scan
  uses: gitleaks/gitleaks-action@v2
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Sign Commits:
# Configure GPG
git config --global user.signingkey YOUR_GPG_KEY
git config --global commit.gpgsign true
# Sign commits
git commit -S -m "Add deployment configuration"
Verify Commits:
git verify-commit HEAD
7. Documentation
README Template:
# Project Name
## Overview
Brief description of the project and its purpose.
## Architecture
High-level architecture diagram and explanation.
## Repository Structure
project/
├── apps/              # Application manifests
├── infrastructure/    # Infrastructure code
└── docs/              # Documentation
## Prerequisites
- Kubernetes 1.25+
- kubectl
- kustomize
## Deployment
Step-by-step deployment instructions.
## Monitoring
Links to dashboards and monitoring tools.
## Troubleshooting
Common issues and solutions.
## Contributing
Contribution guidelines.
8. Rollback Strategy
Keep Rollback Simple:
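# With ArgoCD: roll back to a history ID listed by `argocd app history my-app`
# (rollback is not available while automated sync is enabled)
argocd app rollback my-app 0
# With Git: revert the change and let the agent reconcile
git revert HEAD
git push
# With Flux: after the revert is pushed, trigger reconciliation right away
flux reconcile kustomization my-app --with-source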
Test Rollback Procedures:
- Practice rollbacks regularly
- Automate rollback triggers
- Monitor rollback success
9. Change Management
Pull Request Template:
## Description
What does this PR do?
## Type of Change
- [ ] New feature
- [ ] Bug fix
- [ ] Configuration change
- [ ] Infrastructure change
## Impact Analysis
- [ ] Affects production
- [ ] Requires downtime
- [ ] Breaking change
- [ ] Rollback plan documented
## Testing
- [ ] Tested in dev
- [ ] Tested in staging
- [ ] Load testing completed
- [ ] Security review completed
## Deployment Plan
Step-by-step deployment instructions
## Rollback Plan
Step-by-step rollback instructions
## Checklist
- [ ] Documentation updated
- [ ] Monitoring alerts configured
- [ ] Team notified
10. Observability
Monitor GitOps Health:
# Prometheus metrics
- argocd_app_sync_total
- argocd_app_health_status
- gitops_runtime_reconcile_duration_seconds
- flux_reconcile_duration_seconds
Dashboard Metrics:
- Sync success rate
- Time to sync
- Drift detection count
- Failed reconciliations
- Deployment frequency
- Lead time for changes
- Mean time to recovery (MTTR)
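Several of these dashboard figures reduce to simple arithmetic over timestamps. The sketch below computes lead time for changes, MTTR, and deployment frequency from sample data; all timestamps are invented for illustration and the helper name is hypothetical.

```python
from datetime import datetime, timedelta


def mean_delta(pairs):
    """Average duration across (start, end) datetime pairs."""
    deltas = [end - start for start, end in pairs]
    return sum(deltas, timedelta()) / len(deltas)


# (commit time, deployed time) for three changes in one week.
changes = [
    (datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 11, 0)),
    (datetime(2025, 1, 7, 10, 0), datetime(2025, 1, 7, 16, 0)),
    (datetime(2025, 1, 8, 8, 0), datetime(2025, 1, 8, 9, 0)),
]
# (incident start, service restored) for two incidents.
incidents = [
    (datetime(2025, 1, 7, 16, 30), datetime(2025, 1, 7, 17, 0)),
    (datetime(2025, 1, 8, 12, 0), datetime(2025, 1, 8, 13, 30)),
]

lead_time = mean_delta(changes)
mttr = mean_delta(incidents)
deployments_per_day = len(changes) / 7

print(lead_time, mttr, round(deployments_per_day, 2))
```

In a GitOps setup the commit and deploy timestamps can usually be taken from Git history and the agent's sync records, which is part of why these metrics are cheap to track.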
11. Disaster Recovery
Backup Strategy:
# Backup cluster state ("get all" covers common workload types only,
# not ConfigMaps, Secrets, or custom resources)
kubectl get all --all-namespaces -o yaml > cluster-backup.yaml
# Backup ArgoCD applications
argocd app list -o yaml > argocd-apps-backup.yaml
Recovery Plan:
- Restore infrastructure (Terraform)
- Deploy GitOps operator
- Apply application definitions
- Verify sync status
12. Progressive Delivery
Canary with Flagger:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        url: http://load-tester.test/
        timeout: 5s
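The `stepWeight`, `maxWeight`, and `interval` settings together determine how long a healthy canary takes to promote. The sketch below works through that arithmetic as a simplified model of the analysis loop; it assumes every check passes and that `maxWeight` is a multiple of `stepWeight`, and the function names are hypothetical.

```python
def canary_schedule(step_weight, max_weight):
    """Traffic percentages sent to the canary at each successful analysis step."""
    return list(range(step_weight, max_weight + 1, step_weight))


def minimum_promotion_minutes(step_weight, max_weight, interval_minutes):
    """Shortest time to reach max_weight when every analysis run succeeds."""
    return len(canary_schedule(step_weight, max_weight)) * interval_minutes


# Values from the Canary resource above: stepWeight 5, maxWeight 50, interval 1m.
schedule = canary_schedule(5, 50)
print(schedule)
print(minimum_promotion_minutes(5, 50, 1))
```

With these settings the canary moves through ten steps, so promotion takes at least ten minutes; a smaller `stepWeight` gives finer control at the cost of a longer rollout.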
Common Pitfalls
1. Committing Secrets to Git
Problem: Secrets accidentally committed to repository.
Solution:
- Use .gitignore
- Use git-secrets or gitleaks
- Use external secret management
- Rotate exposed secrets immediately
# Install git-secrets
git secrets --install
git secrets --register-aws
# Scan repository
git secrets --scan
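Scanners of this kind work largely by matching known credential formats. The sketch below shows the idea with a single, simplified pattern for AWS access key IDs; real tools ship many more patterns plus allow-lists, and the sample text uses the placeholder key from AWS documentation.

```python
import re

# Simplified pattern: AWS access key IDs start with AKIA or ASIA,
# followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_secrets(text):
    """Return the 1-based line numbers that contain a suspected key."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if AWS_ACCESS_KEY_ID.search(line):
            findings.append(line_number)
    return findings


sample = (
    "[default]\n"
    "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
    "region = us-east-1\n"
)
print(find_secrets(sample))
```

Run as a pre-commit check, a non-empty result would block the commit before the secret ever reaches the repository history.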
2. Direct Cluster Modifications
Problem: Manual kubectl commands bypass GitOps.
Solution:
- Enforce RBAC policies
- Use admission controllers
- Audit cluster changes
- Educate team on GitOps workflow
# OPA Policy: Deny manual changes
package kubernetes.admission
deny[msg] {
    input.request.userInfo.username != "system:serviceaccount:flux-system:flux"
    msg := "Manual changes not allowed. Use GitOps."
}
3. Not Testing Before Merge
Problem: Broken configurations merged to main.
Solution:
- Require CI checks to pass
- Use branch protection
- Enable preview environments
# Branch protection
main:
  required_status_checks:
    - validate-manifests
    - security-scan
  required_reviews: 2
4. Ignoring Drift
Problem: Actual state diverges from desired state.
Solution:
- Enable auto-sync
- Monitor drift metrics
- Set up alerts
# ArgoCD auto-sync
syncPolicy:
  automated:
    prune: true
    selfHeal: true
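Conceptually, drift correction is a comparison between the desired state in Git and the actual state in the cluster, followed by the actions needed to close the gap. The sketch below models that loop with plain dictionaries; it is an illustration of the idea under simplified assumptions, not how ArgoCD or Flux are implemented, and all names are hypothetical.

```python
import copy


def plan_sync(desired, actual, prune=True):
    """Compare desired (Git) and actual (cluster) state and list the needed actions."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    if prune:
        # Pruning removes resources that exist in the cluster but not in Git.
        for name in sorted(actual.keys() - desired.keys()):
            actions.append(("delete", name))
    return actions


def reconcile(desired, actual, prune=True):
    """Apply the planned actions to actual in place and return them."""
    actions = plan_sync(desired, actual, prune)
    for action, name in actions:
        if action == "delete":
            del actual[name]
        else:
            actual[name] = copy.deepcopy(desired[name])
    return actions


desired = {"deployment/my-app": {"replicas": 3}, "service/my-app": {"port": 80}}
actual = {"deployment/my-app": {"replicas": 5}, "configmap/old": {}}

first_pass = reconcile(desired, actual)
second_pass = reconcile(desired, actual)
print(first_pass)
print(second_pass)
```

The first pass corrects the manually scaled Deployment, creates the missing Service, and prunes the stray ConfigMap; the second pass finds nothing to do, which is the steady state a GitOps agent aims for.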
5. Poor Repository Structure
Problem: Difficult to navigate and maintain.
Solution:
- Follow consistent structure
- Document organization
- Use clear naming conventions
6. Missing Rollback Plan
Problem: No clear way to revert changes.
Solution:
- Document rollback procedures
- Practice rollbacks
- Keep rollback simple (git revert)
7. Inadequate Monitoring
Problem: Teams don't know when deployments fail.
Solution:
- Monitor GitOps metrics
- Set up alerts
- Integrate with incident management
8. Over-Complicated Pipelines
Problem: Complex pipelines are hard to maintain.
Solution:
- Keep pipelines simple
- Use reusable workflows
- Document complex logic
9. Lack of Documentation
Problem: The team doesn't understand the workflows.
Solution:
- Document processes
- Create runbooks
- Provide training
10. Not Using Environments Properly
Problem: Testing directly in production.
Solution:
- Use dev/staging/prod environments
- Test in lower environments first
- Automate promotion
Jargon Tables
Table 1: GitOps Lifecycle Terminology
| GitOps Term | Alternative Terms | Definition | Context |
|---|---|---|---|
| Desired State | Target state, intended state | Configuration stored in Git | What you want |
| Actual State | Current state, live state, runtime state | Current configuration in cluster | What you have |
| Reconciliation | Sync, convergence, drift correction | Process of aligning actual with desired | Continuous process |
| Drift | Configuration drift, state divergence | Difference between desired and actual state | Problem detection |
| Sync | Synchronization, apply, deploy | Update actual state to match desired | Action |
| Pull-based | Agent-based, operator pattern | Agent pulls changes from Git | GitOps model |
| Push-based | Traditional CI/CD, pipeline deploy | Pipeline pushes to cluster | Traditional model |
| Declarative | Descriptive, state-based | Define what you want, not how | Configuration style |
| Imperative | Procedural, command-based | Define how to achieve state | Traditional approach |
| Manifest | Configuration file, resource definition | YAML/JSON describing resources | K8s terminology |
| GitOps Agent | Operator, controller, reconciler | Software monitoring and applying changes | ArgoCD, Flux |
| Source of Truth | Single source, canonical source | Authoritative configuration location | Git repository |
| Auto-sync | Automated sync, continuous deployment | Automatic application of changes | GitOps feature |
| Self-heal | Auto-remediation, drift correction | Automatic correction of manual changes | GitOps feature |
| Prune | Cleanup, deletion | Remove resources not in desired state | GitOps operation |
Table 2: Git Operations Terminology
| Git Term | Alternative Terms | Definition | Common Commands |
|---|---|---|---|
| Repository | Repo, project | Directory with Git history | git init, git clone |
| Commit | Revision, snapshot, changeset | Saved state of repository | git commit |
| Branch | Line of development | Parallel version of code | git branch, git checkout |
| Merge | Integration, combine | Integrate changes from branches | git merge |
| Pull Request | PR, merge request (GitLab) | Request to merge changes | GitHub/GitLab UI |
| Tag | Release tag, version tag | Named reference to commit | git tag |
| Push | Upload, publish | Send commits to remote | git push |
| Pull | Download, fetch+merge | Get changes from remote | git pull |
| Fetch | Retrieve, download | Get remote changes without merge | git fetch |
| Rebase | Reapply, replay commits | Move commits to new base | git rebase |
| Cherry-pick | Select commit | Apply specific commit | git cherry-pick |
| Stash | Temporary save | Save uncommitted changes | git stash |
| Reset | Undo, rewind | Move HEAD to different commit | git reset |
| Revert | Reverse, undo commit | Create new commit undoing changes | git revert |
| Remote | Repository URL | Remote repository reference | git remote |
Table 3: CI/CD Pipeline Stages
| Stage | Alternative Names | Purpose | Common Tools |
|---|---|---|---|
| Source | Code checkout, clone | Get code from repository | Git, GitHub, GitLab |
| Build | Compile, package | Create deployable artifacts | Docker, Maven, npm |
| Test | Validation, quality check | Verify code quality | Jest, pytest, JUnit |
| Security Scan | SAST, vulnerability scan | Identify security issues | Trivy, Snyk, SonarQube |
| Artifact Storage | Registry, repository | Store build artifacts | Docker Hub, ECR, Nexus |
| Deploy | Release, rollout | Deploy to environment | ArgoCD, Flux, Helm |
| Verify | Smoke test, health check | Confirm deployment success | curl, k8s probes |
| Promote | Environment progression | Move between environments | Git operations |
Table 4: Hierarchical GitOps Architecture
| Level | Component | Sub-Component | Purpose | Tools |
|---|---|---|---|---|
| 1 | Source Control |  | Version control system | Git |
|  |  | Repository | Store configurations | GitHub, GitLab |
|  |  | Branch | Parallel development | Git branches |
|  |  | Pull Request | Code review mechanism | GitHub PR, GitLab MR |
| 2 | CI Pipeline |  | Continuous Integration | GitHub Actions, GitLab CI |
|  |  | Build | Create artifacts | Docker build |
|  |  | Test | Validation | pytest, jest |
|  |  | Security | Vulnerability scanning | Trivy, Snyk |
| 3 | Artifact Registry |  | Store build outputs | Container registries |
|  |  | Container Images | Docker images | Docker Hub, ECR, GCR |
|  |  | Helm Charts | K8s packages | Helm registry |
| 4 | GitOps Operator |  | Sync engine | ArgoCD, Flux |
|  |  | Source Controller | Monitor Git repos | Flux Source Controller |
|  |  | Sync Controller | Apply changes | ArgoCD Application Controller |
|  |  | Health Assessment | Check resource status | Health checks |
| 5 | Target Environment |  | Deployment destination | Kubernetes |
|  |  | Cluster | K8s cluster | EKS, GKE, AKS |
|  |  | Namespace | Logical separation | K8s namespaces |
|  |  | Workloads | Running applications | Deployments, StatefulSets |
Table 5: Deployment Strategy Comparison
| Strategy | Speed | Risk | Downtime | Resource Cost | Rollback Speed | Use Case |
|---|---|---|---|---|---|---|
| Recreate | Fast | High | Yes | Low | Slow | Dev environments |
| Rolling Update | Medium | Medium | No | Low | Medium | Most applications |
| Blue-Green | Instant | Low | No | High (2x) | Instant | Critical services |
| Canary | Slow | Low | No | Medium | Fast | High-risk changes |
| A/B Testing | Slow | Low | No | Medium | N/A | Feature testing |
Table 6: GitOps Tool Comparison
| Feature | ArgoCD | Flux | Jenkins X | Spinnaker |
|---|---|---|---|---|
| Architecture | Controller | Operator | Platform | Pipeline |
| UI | Rich | Basic | Good | Rich |
| Multi-tenant | Native | Via namespaces | Native | Native |
| Helm Support | Full | Full | Native | Full |
| Kustomize | Native | Native | Plugin | Plugin |
| SSO/OIDC | Yes | No | Yes | Yes |
| RBAC | Fine-grained | K8s RBAC | K8s RBAC | Fine-grained |
| Webhook Events | Yes | Yes | Yes | Yes |
| Notifications | Multiple | Multiple | Multiple | Multiple |
| Progressive Delivery | Via Argo Rollouts | Via Flagger | No | Native |
| Learning Curve | Medium | Low | High | High |
| Community | Large | Large | Medium | Large |
Table 7: Infrastructure as Code Tools
| Tool | Language | Cloud Support | State Management | Use Case |
|---|---|---|---|---|
| Terraform | HCL | Multi-cloud | Remote backends | Universal IaC |
| Pulumi | TypeScript, Python, Go | Multi-cloud | Cloud storage | Code-first IaC |
| CloudFormation | YAML/JSON | AWS only | AWS managed | AWS native |
| Ansible | YAML | Multi-cloud | Stateless | Configuration management |
| Helm | YAML + Templates | Kubernetes | In-cluster | K8s packages |
| Kustomize | YAML + Overlays | Kubernetes | Stateless | K8s configuration |
Table 8: Security Components in GitOps
| Component | Purpose | Tools | Integration Point |
|---|---|---|---|
| Secret Management | Secure credentials | Sealed Secrets, External Secrets | Git repository |
| Image Scanning | Vulnerability detection | Trivy, Snyk, Clair | CI pipeline |
| Policy Enforcement | Compliance checks | OPA, Kyverno, Gatekeeper | Admission controller |
| RBAC | Access control | K8s RBAC, IAM | Cluster |
| Network Policies | Traffic control | Calico, Cilium | Kubernetes |
| Audit Logging | Change tracking | K8s audit, Git history | Multiple |
| Signing | Artifact verification | Cosign, Notary | Container registry |
| SAST | Code analysis | SonarQube, CodeQL | CI pipeline |
Complete GitOps Workflow Example
Scenario: Deploy New Application Version
Step 1: Developer Makes Changes
# Create feature branch
git checkout -b feature/update-version
# Update application code
vim src/app.py
# Update Docker image version
vim k8s/base/deployment.yaml
# Commit changes
git add .
git commit -m "feat: update application to v1.2.0"
# Push to remote
git push origin feature/update-version
Step 2: Create Pull Request
# GitHub Actions runs automatically
name: CI Pipeline
on:
  pull_request:
    branches: [ main ]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint code
        run: npm run lint
      - name: Run tests
        run: npm test
      # Tag the image with the pull request number so each PR gets its own build
      - name: Build Docker image
        run: docker build -t myapp:pr-${{ github.event.pull_request.number }} .
      - name: Scan image
        run: trivy image myapp:pr-${{ github.event.pull_request.number }}
      - name: Validate K8s manifests
        run: kustomize build k8s/overlays/prod | kubeconform -
Step 3: Code Review and Approval
# PR Review Checklist
- [ ] Code follows style guidelines
- [ ] Tests pass
- [ ] Security scan clean
- [ ] Documentation updated
- [ ] Approved by 2 reviewers
Step 4: Merge to Main
# After approval, merge PR
git checkout main
git merge feature/update-version
git push origin main
Step 5: CI/CD Pipeline Builds and Pushes
name: Build and Deploy
on:
  push:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Tag the image with the commit SHA so every build is traceable to a commit
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Push to registry
        run: docker push myapp:${{ github.sha }}
      # Record the new image tag in the GitOps repository; the agent deploys it from there
      - name: Update GitOps repo
        run: |
          git clone https://github.com/org/gitops-repo.git
          cd gitops-repo/k8s/overlays/prod
          kustomize edit set image myapp=myapp:${{ github.sha }}
          git commit -am "Deploy myapp:${{ github.sha }}"
          git push
Step 6: GitOps Agent Syncs
# ArgoCD detects change in Git
# Reconciliation loop:
# 1. Fetch latest from Git
# 2. Compare with cluster state
# 3. Apply differences
# 4. Monitor health
# View sync status
argocd app get myapp
Step 7: Verification
# Check deployment
kubectl get deployment myapp -n production
# Check pods
kubectl get pods -n production -l app=myapp
# Check logs
kubectl logs -n production -l app=myapp --tail=100
# Verify health
curl https://myapp.example.com/health
Step 8: Monitoring
# Prometheus alerts fire if issues detected
- alert: DeploymentFailed
  expr: kube_deployment_status_replicas_unavailable > 0
  for: 5m
- alert: HighErrorRate
  expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
  for: 10m
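The `HighErrorRate` expression compares a per-second rate of 5xx responses against 0.05, so the threshold is an absolute number of errors per second rather than a percentage of traffic. The worked example below shows the arithmetic under simplifying assumptions: the counter values are hypothetical, and `per_second_rate` ignores the counter resets and extrapolation that Prometheus handles.

```python
def per_second_rate(first_value: float, last_value: float, window_seconds: float) -> float:
    """Average per-second increase of a monotonically increasing counter."""
    return (last_value - first_value) / window_seconds


# Hypothetical samples: the 5xx counter grew from 1200 to 1230 over a 5-minute window.
rate_5xx = per_second_rate(1200, 1230, 300)

print(rate_5xx)         # 30 errors / 300 s = 0.1 errors per second
print(rate_5xx > 0.05)  # the alert condition holds for this window
```

If the intent is "more than 5% of requests fail", the expression would need to divide the 5xx rate by the total request rate instead.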
Step 9: Rollback (if needed)
# Option 1: Git revert
git revert HEAD
git push
# Option 2: ArgoCD rollback
argocd app rollback myapp 0
# Option 3: Manual kubectl
kubectl rollout undo deployment/myapp -n production
Summary
GitOps is a powerful operational framework that leverages Git as the single source of truth for declarative infrastructure and applications. By treating infrastructure and application configuration as code stored in Git repositories, teams can achieve:
Key Benefits
- Increased Velocity: Faster deployments with automated pipelines
- Improved Stability: Declarative configurations reduce errors
- Enhanced Security: Audit trails, RBAC, and secret management
- Better Collaboration: Git-based workflows enable code review
- Disaster Recovery: Complete system state in Git enables easy restoration
- Compliance: Full audit trail of all changes
Core Principles
- Declarative: System's desired state described declaratively
- Versioned: All configuration stored in Git with full history
- Automated: Software agents automatically apply desired state
- Reconciled: Continuous monitoring and drift correction
Essential Components
- Git: Version control and source of truth
- CI/CD: Automated pipelines for building and testing
- GitOps Agent: ArgoCD, Flux, or similar tools
- Kubernetes: Target platform for deployments
- IaC Tools: Terraform, Helm, Kustomize
Best Practices
- Separate application code from configuration
- Use directories, not branches, for environments
- Implement comprehensive testing
- Secure secrets with external management
- Monitor GitOps metrics and health
- Document processes and maintain runbooks
- Practice rollback procedures regularly
GitOps represents a paradigm shift in how we manage and deploy applications, bringing the best practices of software development to operations. By embracing GitOps, teams can build more reliable, secure, and scalable systems.
What is GitOps?
GitOps treats infrastructure and application configuration as code, stored in Git repositories. All changes to infrastructure and applications are made through Git commits and pull requests, triggering automated processes that synchronize the desired state (in Git) with the actual state (in production).
Key Characteristics
- Declarative: Define the desired state of your system rather than imperative instructions
- Versioned and Immutable: All changes are tracked in Git with complete history
- Automatically Applied: Automated agents continuously reconcile actual state with desired state
- Continuously Reconciled: Systems self-heal by detecting and correcting drift
When to Use GitOps
✅ Ideal For:
- Kubernetes and container orchestration
- Cloud-native applications
- Microservices architectures
- Infrastructure as Code (IaC) deployments
- Multi-environment management
- Teams requiring audit trails and compliance
❌ Not Ideal For:
- Legacy monolithic applications without automation
- Simple static websites
- One-off deployments without version control needs
Core Concepts
Single Source of Truth
Git serves as the canonical source for both application code and infrastructure configuration. Every aspect of your system's desired state is stored in Git repositories.
Benefits:
- Complete audit trail of all changes
- Easy rollback to any previous state
- Clear separation of concerns
- Disaster recovery capabilities
# Example: Kubernetes deployment stored in Git
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-application
  template:
    metadata:
      labels:
        app: my-application
    spec:
      containers:
        - name: app
          image: myapp:v1.2.3
          ports:
            - containerPort: 8080
Declarative Configuration
Describe what you want, not how to achieve it. The system determines the necessary steps to reach the desired state.
Imperative vs Declarative:
# Imperative (how to do it)
kubectl create namespace production
kubectl create deployment my-app --image=myapp:1.0
kubectl scale deployment my-app --replicas=3
kubectl expose deployment my-app --port=8080
# Declarative (what you want)
kubectl apply -f production-deployment.yaml
Continuous Reconciliation
Automated agents continuously monitor the actual state and compare it with the desired state in Git. Any drift is automatically corrected.
Reconciliation Loop:
- Observe: Monitor actual state of infrastructure
- Compare: Check against desired state in Git
- Detect Drift: Identify differences
- Remediate: Automatically apply changes to align states
- Repeat: Continuously monitor
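The loop above can be sketched in a few lines of Python. This is a toy model under stated assumptions: states are plain dictionaries keyed by resource name, and `reconcile` is a hypothetical function, not the algorithm of ArgoCD or Flux.

```python
def reconcile(desired: dict, actual: dict, prune: bool = True) -> dict:
    """Run one reconciliation pass and return the resulting actual state."""
    new_state = dict(actual)
    # Compare and remediate: create missing resources, overwrite drifted ones.
    for name, spec in desired.items():
        if actual.get(name) != spec:
            new_state[name] = spec
    # Optionally remove resources that are no longer declared in Git.
    if prune:
        for name in actual:
            if name not in desired:
                del new_state[name]
    return new_state


desired = {"my-application": {"replicas": 3}}
actual = {"my-application": {"replicas": 5}, "orphan-job": {"replicas": 1}}

actual = reconcile(desired, actual)
print(actual)  # the manual scale-up is reverted and the orphan is pruned
```

A real agent would run this pass on a timer or in response to Git and cluster events, which is what makes the correction continuous.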
Pull vs Push Deployment
Traditional Push Model (CI/CD):
- CI/CD pipeline pushes changes to production
- Requires cluster credentials in CI/CD system
- Pipeline has write access to production
GitOps Pull Model:
- Agent inside cluster pulls changes from Git
- No external system needs cluster access
- Improved security posture
- Self-healing capabilities
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│  Developer  │────────▶│  Git Repo   │◀────────│   GitOps    │
│             │ commit  │             │  pull   │   Agent     │
└─────────────┘         └─────────────┘         └──────┬──────┘
                                                       │
                                                       │ apply
                                                       ▼
                                                ┌─────────────┐
                                                │ Kubernetes  │
                                                │  Cluster    │
                                                └─────────────┘
GitOps Principles
1. Declarative Description
The entire system's desired state is described declaratively in a format that machines can parse and understand (YAML, JSON, HCL, etc.).
Example - Terraform Configuration:
resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  tags = {
    Name        = "WebServer"
    Environment = "Production"
  }
}
2. Versioned and Immutable
All desired state is stored in a version control system that provides versioning, immutability, and audit trails.
Git Provides:
- Complete change history
- Author attribution
- Timestamps
- Commit messages explaining changes
- Ability to revert to any previous state
# View change history
git log --oneline --graph
# See what changed
git diff HEAD~1 deployment.yaml
# Revert to previous version
git revert abc123
3. Pulled Automatically
Software agents automatically pull the desired state declarations from Git and apply them to the infrastructure.
GitOps Agents:
- ArgoCD: Kubernetes-native continuous delivery
- Flux: GitOps operator for Kubernetes
- Jenkins X: Cloud-native CI/CD for Kubernetes
- Terraform Cloud: IaC automation platform
4. Continuously Reconciled
Software agents continuously observe actual system state and attempt to apply the desired state.
Drift Detection and Correction:
# Desired state in Git: 3 replicas
spec:
  replicas: 3
# Actual state: 5 replicas (manually scaled)
# GitOps agent detects drift and corrects to 3 replicas
Git Fundamentals
Git Basics
Git is a distributed version control system that tracks changes in source code during software development.
Key Concepts
Repository: Directory containing your project files and Git metadata
# Initialize new repository
git init
# Clone existing repository
git clone https://github.com/username/repo.git
Commit: Snapshot of your repository at a specific point in time
# Stage files
git add filename.yaml
# Commit with message
git commit -m "Add production deployment configuration"
# View commit history
git log
Branch: Parallel version of your repository
# Create new branch
git branch feature/new-deployment
# Switch to branch
git checkout feature/new-deployment
# Create and switch in one command
git checkout -b feature/new-deployment
# List branches
git branch -a
Merge: Integrate changes from one branch into another
# Merge feature branch into main
git checkout main
git merge feature/new-deployment
Tag: Named reference to a specific commit (often used for releases)
# Create tag
git tag -a v1.0.0 -m "Release version 1.0.0"
# Push tags to remote
git push origin --tags
# List tags
git tag -l
Git Workflow
# 1. Update local repository
git pull origin main
# 2. Create feature branch
git checkout -b feature/update-deployment
# 3. Make changes to files
vim deployment.yaml
# 4. Stage changes
git add deployment.yaml
# 5. Commit changes
git commit -m "Update deployment replicas to 5"
# 6. Push to remote
git push origin feature/update-deployment
# 7. Create pull request (on GitHub/GitLab)
# 8. Review and merge
# 9. Delete feature branch
git branch -d feature/update-deployment
Git Configuration
# Set user information
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
# Set default branch name
git config --global init.defaultBranch main
# Configure editor
git config --global core.editor "vim"
# View configuration
git config --list
Git Best Practices
Commit Messages
Good Commit Messages:
Add Kubernetes deployment for user service
- Configure 3 replicas for high availability
- Set resource limits: 500m CPU, 512Mi memory
- Add health checks on /health endpoint
Bad Commit Messages:
update
fix stuff
changes
Conventional Commits Format:
<type>(<scope>): <subject>
<body>
<footer>
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code style changes (formatting)
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance tasks
Example:
feat(deployment): add horizontal pod autoscaling
Configure HPA to scale between 3-10 replicas based on CPU
utilization target of 70%. This improves application availability
during traffic spikes.
Closes #123
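A commit header in this format is easy to check mechanically, for example in a commit-msg hook. The sketch below is one possible validator: the regular expression and the `parse_header` helper are assumptions for illustration and cover only the seven types listed above.

```python
import re

TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# <type>(<scope>): <subject>  (the scope is optional)
HEADER = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()]+)\))?"
    r": (?P<subject>.+)$"
)


def parse_header(line: str):
    """Return the header fields as a dict, or None if the line does not conform."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


print(parse_header("feat(deployment): add horizontal pod autoscaling"))
print(parse_header("fix stuff"))  # rejected: no colon after the type
```

Only the first line of the message is validated here; the body and footer are free-form text.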
Branching Hygiene
# Keep branches short-lived
# Delete merged branches
git branch -d feature/completed-feature
# Prune remote-tracking branches
git fetch --prune
# Clean up branches already merged into main (skips the current branch and main itself)
git branch --merged main | grep -vE '^\*|^[[:space:]]*main$' | xargs -n 1 git branch -d
GitHub and GitHub Actions
GitHub Basics
GitHub is a web-based hosting service for Git repositories with collaboration features.
Repository Structure
my-project/
├── .github/
│   ├── workflows/          # GitHub Actions workflows
│   │   ├── ci.yml
│   │   └── deploy.yml
│   ├── CODEOWNERS          # Code review assignments
│   └── dependabot.yml      # Dependency updates
├── .gitignore              # Files to ignore
├── README.md               # Project documentation
├── LICENSE                 # License file
└── src/                    # Application code
GitHub Features for GitOps
Pull Requests: Code review and collaboration mechanism
# Example: PR template (.github/pull_request_template.md)
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed
## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
Protected Branches: Enforce code quality standards
# Branch protection rules
main:
  required_reviews: 2
  require_code_owner_review: true
  dismiss_stale_reviews: true
  require_status_checks: true
  required_status_checks:
    - ci/lint
    - ci/test
    - ci/security-scan
  enforce_admins: true
  restrict_pushes: true
Code Owners: Automatic reviewer assignment
# .github/CODEOWNERS
# Global owners
* @team-leads
# Infrastructure files
/terraform/ @platform-team @sre-team
/kubernetes/ @platform-team
# Application code
/src/ @dev-team
# Documentation
/docs/ @doc-team @dev-team
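In a CODEOWNERS file the last matching pattern takes precedence, which is why the catch-all `*` rule sits at the top. The sketch below models that ordering for the rules above. The matcher is a deliberately simplified assumption (it understands only `*` and leading-slash directory prefixes), and `owners_for` is a hypothetical helper, not GitHub's implementation.

```python
RULES = [
    ("*", ["@team-leads"]),
    ("/terraform/", ["@platform-team", "@sre-team"]),
    ("/kubernetes/", ["@platform-team"]),
    ("/src/", ["@dev-team"]),
    ("/docs/", ["@doc-team", "@dev-team"]),
]


def owners_for(path: str) -> list[str]:
    """Return the owners of a repository-relative path; later rules win."""
    owners = []
    for pattern, rule_owners in RULES:
        if pattern == "*" or ("/" + path).startswith(pattern):
            owners = rule_owners  # a later match replaces an earlier one
    return owners


print(owners_for("README.md"))          # only the catch-all rule matches
print(owners_for("terraform/main.tf"))  # the directory rule overrides it
```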
GitHub Actions
GitHub Actions is a CI/CD platform integrated directly into GitHub.
Workflow Components
Workflow: Automated process defined in YAML
Event: Triggers that start workflows (push, pull_request, schedule, etc.)
Job: Set of steps executed on the same runner
Step: Individual task (run command, use action)
Action: Reusable unit of code
Runner: Server that executes workflows
Basic Workflow Structure
# .github/workflows/ci.yml
name: CI Pipeline
# Events that trigger workflow
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
# Environment variables
env:
  NODE_VERSION: '18'
# Jobs to run
jobs:
  test:
    name: Run Tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run linter
        run: npm run lint
      - name: Run tests
        run: npm test
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage/coverage.xml
Advanced Workflow Features
Matrix Builds: Test across multiple configurations
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node-version: [16, 18, 20]
        exclude:
          - os: macos-latest
            node-version: 16
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test
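A matrix expands to the Cartesian product of its axes, minus any excluded combinations. The short Python sketch below reproduces that expansion for the values in the workflow above; it illustrates the counting only and is not how the runner is implemented.

```python
from itertools import product

oses = ["ubuntu-latest", "windows-latest", "macos-latest"]
node_versions = [16, 18, 20]
excluded = {("macos-latest", 16)}

# Every (os, node-version) pair except the excluded ones becomes its own job.
jobs = [combo for combo in product(oses, node_versions) if combo not in excluded]

for os_name, node in jobs:
    print(f"{os_name} / node {node}")
print(len(jobs))  # 3 x 3 combinations minus 1 exclusion = 8 jobs
```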
Conditional Execution:
steps:
  - name: Deploy to production
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    run: ./deploy.sh production
  - name: Deploy to staging
    if: github.ref == 'refs/heads/develop'
    run: ./deploy.sh staging
Secrets Management:
steps:
  - name: Deploy application
    env:
      # Secrets are injected as environment variables and masked in logs
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    run: |
      aws configure set aws_access_key_id $AWS_ACCESS_KEY_ID
      aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY
      ./deploy.sh
Caching Dependencies:
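steps:
  - uses: actions/checkout@v4
  - name: Cache dependencies
    uses: actions/cache@v3
    with:
      path: ~/.npm
      # The key changes whenever the lockfile changes, which invalidates the cache
      key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        ${{ runner.os }}-node-
  - name: Install dependencies
    run: npm ci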
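Cache keys for dependency caches are commonly built from the runner's operating system plus a hash of the lockfile, with a shorter restore key as a fallback. The sketch below illustrates why that works. `cache_key` and `restore_prefix` are hypothetical helpers, and hashing the raw lockfile bytes with SHA-256 is an assumption that stands in for the platform's own file-hashing function.

```python
import hashlib


def cache_key(os_name: str, lockfile_bytes: bytes) -> str:
    """Exact key: identical lockfile contents always map to the same cache entry."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{os_name}-node-{digest}"


def restore_prefix(os_name: str) -> str:
    """Fallback prefix: matches any earlier cache created on the same OS."""
    return f"{os_name}-node-"


old_key = cache_key("Linux", b'{"lockfileVersion": 2}')
new_key = cache_key("Linux", b'{"lockfileVersion": 3}')

print(old_key != new_key)                            # a changed lockfile misses the exact key
print(old_key.startswith(restore_prefix("Linux")))   # but the old cache is still a fallback
```

The fallback restores a slightly stale cache, and `npm ci` then downloads only what has changed.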
Reusable Workflows:
# .github/workflows/reusable-deploy.yml
name: Reusable Deploy Workflow
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      version:
        required: true
        type: string
    secrets:
      deploy-key:
        required: true
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: ./deploy.sh ${{ inputs.environment }} ${{ inputs.version }}
        env:
          DEPLOY_KEY: ${{ secrets.deploy-key }}
# .github/workflows/main.yml
name: Deploy Application
on:
  push:
    branches: [ main ]
jobs:
  deploy-staging:
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: staging
      version: ${{ github.sha }}
    secrets:
      deploy-key: ${{ secrets.STAGING_DEPLOY_KEY }}        # assumed secret name
  deploy-production:
    needs: deploy-staging
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: production
      version: ${{ github.sha }}
    secrets:
      deploy-key: ${{ secrets.PRODUCTION_DEPLOY_KEY }}     # assumed secret name
Parallel Jobs:
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint
  test-unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:unit
  test-integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:integration
  deploy:
    needs: [lint, test-unit, test-integration]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh
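Jobs without a `needs` entry can start at the same time, while a job that lists dependencies waits for all of them. The sketch below derives that execution order from the dependency graph using Python's standard `graphlib` module. It models only the scheduling order, not the runner itself.

```python
from graphlib import TopologicalSorter

# Each job maps to the jobs it needs, mirroring the workflow above.
needs = {
    "lint": [],
    "test-unit": [],
    "test-integration": [],
    "deploy": ["lint", "test-unit", "test-integration"],
}

sorter = TopologicalSorter(needs)
sorter.prepare()

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # everything in one wave can run in parallel
    waves.append(ready)
    sorter.done(*ready)

print(waves)  # [['lint', 'test-integration', 'test-unit'], ['deploy']]
```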
Service Containers:
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb   # create the database referenced by DATABASE_URL
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/testdb
          REDIS_URL: redis://localhost:6379
        run: npm test
Self-Hosted Runners:
jobs:
  build:
    runs-on: [self-hosted, linux, x64, gpu]
    steps:
      - uses: actions/checkout@v4
      - name: Build with GPU
        run: ./build-with-cuda.sh
Complete CI/CD Workflow Example
name: Complete CI/CD Pipeline
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
  release:
    types: [published]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  # Code quality checks
  lint:
    name: Lint Code
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run ESLint
        run: npm run lint
      - name: Run Prettier
        run: npm run format:check
  # Security scanning
  security:
    name: Security Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload to GitHub Security
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'
  # Unit and integration tests
  test:
    name: Run Tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test -- --coverage
      - name: Upload coverage
        if: matrix.node-version == '18'
        uses: codecov/codecov-action@v3
  # Build and push Docker image
  build:
    name: Build Image
    needs: [lint, security, test]
    runs-on: ubuntu-latest
    if: github.event_name != 'pull_request'
    permissions:
      contents: read
      packages: write
    outputs:
      # Single primary tag computed by the metadata step
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
  # Deploy to staging
  deploy-staging:
    name: Deploy to Staging
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.example.com
    steps:
      - uses: actions/checkout@v4
      - name: Update Kubernetes manifests
        run: |
          cd k8s/staging
          kustomize edit set image app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build.outputs.image-tag }}
      - name: Commit changes
        run: |
          git config user.name github-actions
          git config user.email github-actions@github.com
          git add .
          git commit -m "Deploy ${{ needs.build.outputs.image-tag }} to staging"
          git push
  # Deploy to production
  deploy-production:
    name: Deploy to Production
    needs: build
    runs-on: ubuntu-latest
    if: github.event_name == 'release'
    environment:
      name: production
      url: https://example.com
    steps:
      - uses: actions/checkout@v4
      - name: Update Kubernetes manifests
        run: |
          cd k8s/production
          kustomize edit set image app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build.outputs.image-tag }}
      - name: Commit changes
        run: |
          git config user.name github-actions
          git config user.email github-actions@github.com
          git add .
          git commit -m "Deploy ${{ needs.build.outputs.image-tag }} to production"
          git push
GitLab CI/CD
GitLab Overview
GitLab is a complete DevOps platform with built-in CI/CD capabilities.
GitLab CI/CD Configuration
GitLab uses .gitlab-ci.yml file in the repository root.
# .gitlab-ci.yml
stages:
  - build
  - test
  - deploy
variables:
  DOCKER_REGISTRY: registry.gitlab.com
  IMAGE_NAME: $CI_REGISTRY_IMAGE
# Build stage
build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  before_script:
    # Authenticate against the built-in registry using predefined CI variables
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
  script:
    - docker build -t $IMAGE_NAME:$CI_COMMIT_SHA .
    - docker push $IMAGE_NAME:$CI_COMMIT_SHA
  only:
    - main
    - develop
# Test stage
test:
  stage: test
  image: node:18
  cache:
    paths:
      - node_modules/
  script:
    - npm ci
    - npm run lint
    - npm test
  coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
# Deploy to staging
deploy-staging:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]   # override the image entrypoint so the script runs in a shell
  script:
    - kubectl set image deployment/my-app app=$IMAGE_NAME:$CI_COMMIT_SHA
    - kubectl rollout status deployment/my-app
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - develop
# Deploy to production
deploy-production:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - kubectl set image deployment/my-app app=$IMAGE_NAME:$CI_COMMIT_SHA
    - kubectl rollout status deployment/my-app
  environment:
    name: production
    url: https://example.com
  when: manual
  only:
    - main
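The `coverage` keyword in the test job is a regular expression that is matched against the job log to extract a percentage. The sketch below applies the same pattern in Python to a hypothetical log excerpt, which is a convenient way to check such a pattern before committing it. `extract_coverage` and the sample log lines are assumptions for illustration.

```python
import re

# Same pattern as the pipeline's coverage keyword, without the surrounding slashes.
COVERAGE = re.compile(r"Lines\s*:\s*(\d+\.\d+)%")


def extract_coverage(log: str):
    """Return the line coverage as a float, or None if the pattern does not match."""
    match = COVERAGE.search(log)
    return float(match.group(1)) if match else None


log = "Statements   : 91.3% ( 105/115 )\nLines        : 92.45% ( 98/106 )\n"

print(extract_coverage(log))           # 92.45
print(extract_coverage("Lines : 85%"))  # None: the pattern requires a decimal point
```

The second call shows a limitation worth knowing: a whole-number percentage is not matched unless the pattern is relaxed to something like `\d+(?:\.\d+)?`.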
GitLab Features
Auto DevOps: Automated CI/CD pipeline
Container Registry: Built-in Docker registry
Kubernetes Integration: Native K8s deployment
Security Scanning: SAST, DAST, dependency scanning
Merge Requests: Code review process
Protected Branches: Enforce merge requirements
Repository Structure
Separation of Concerns
Best Practice: Separate application code from infrastructure configuration.
Reasons:
- Different lifecycles and release cadences
- Different teams and approval processes
- Configuration changes shouldn't trigger app rebuilds
- Security and access control separation
Repository Organization
Pattern 1: Monorepo for Small Projects
project/
├── src/                    # Application code
│   ├── frontend/
│   ├── backend/
│   └── shared/
├── infrastructure/         # Infrastructure code
│   ├── terraform/
│   │   ├── modules/
│   │   ├── environments/
│   │   │   ├── dev/
│   │   │   ├── staging/
│   │   │   └── prod/
│   │   └── main.tf
│   └── kubernetes/
│       ├── base/
│       └── overlays/
│           ├── dev/
│           ├── staging/
│           └── prod/
├── .github/
│   └── workflows/
└── docs/
Pattern 2: Multi-Repo for Enterprise
Application Repository:
app-user-service/
├── src/
├── tests/
├── Dockerfile
├── .github/
│   └── workflows/
│       ├── ci.yml
│       └── build.yml
└── README.md
Infrastructure Repository:
infrastructure/
├── terraform/
│   ├── modules/
│   │   ├── vpc/
│   │   ├── eks/
│   │   └── rds/
│   └── environments/
│       ├── dev/
│       ├── staging/
│       └── prod/
├── .github/
│   └── workflows/
│       └── terraform.yml
└── README.md
Configuration Repository (GitOps):
gitops-config/
├── clusters/
│   ├── dev/
│   │   ├── apps/
│   │   ├── infrastructure/
│   │   └── system/
│   ├── staging/
│   └── prod/
├── apps/
│   ├── user-service/
│   │   ├── base/
│   │   │   ├── deployment.yaml
│   │   │   ├── service.yaml
│   │   │   └── kustomization.yaml
│   │   └── overlays/
│   │       ├── dev/
│   │       ├── staging/
│   │       └── prod/
│   └── payment-service/
└── README.md
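The base and overlays layout works because each environment stores only its differences from the shared base. The sketch below conveys the idea with a plain recursive dictionary merge. This is only a rough approximation of how Kustomize applies patches, and `apply_overlay` along with the sample values is hypothetical.

```python
import copy


def apply_overlay(base: dict, patch: dict) -> dict:
    """Return a copy of base with the patch's values merged in, key by key."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_overlay(result[key], value)  # merge nested mappings
        else:
            result[key] = copy.deepcopy(value)               # override scalars and lists
    return result


base = {"kind": "Deployment", "spec": {"replicas": 1, "template": {"image": "myapp:v1.2.3"}}}
prod_patch = {"spec": {"replicas": 3}}

prod = apply_overlay(base, prod_patch)
print(prod["spec"])  # replicas overridden, everything else inherited from the base
```

Because the base is never mutated, the same base can be combined with the dev, staging, and prod patches independently.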
Pattern 3: Package Repository
Platform Repository (Fleet-wide config):
platform-config/
├── cluster-config/
│   ├── namespaces/
│   ├── rbac/
│   └── network-policies/
├── shared-services/
│   ├── ingress-nginx/
│   ├── cert-manager/
│   └── monitoring/
└── policies/
    ├── pod-security/
    └── network/
###