Coding Notes

Table of contents

Overview
Network topology
Getting a domain
- DDNS
VPN
Local DNS
Container orchestration
- Installation
- Bonus
Persistent storage
Backup
Gateway
- Installation
  - Traefik
  - Istio
- TLS
Cluster UI
CI/CD
- Drone CI
  - Installation
- Harbor
  - Caching images
🚧 Observability

Overview

Brief overview of the tools I use.

"🚧" means under construction, subject to change, or that I'm still thinking about which tools to use.

Hardware, OS

Hardware
- Server
  - CPU: Intel(R) Core(TM) i5-14500
  - RAM: 64G
  - Storage: 1T M.2 NVMe
- VPN server:
  
  My old PC, currently just runs the VPN server
  - CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
  - RAM: 8G
  - Storage:
    - 128G SSD
    - 1T HDD
OS: Ubuntu 24.04

DNS

External Domain registration and DNS provider: CloudFlare
- Uses CloudFlare API and ipify to implement DDNS, updates DNS record every 5 minutes, runs locally as a docker container
Local DNS: CoreDNS
- Self hosted with docker
- Used for registering domains that are only accessible in my local network

VPN

OpenVPN with TUN interface

Container orchestration

K3s with Cilium as kube-proxy replacement
Rancher as the UI for K3s

Persistent Storage

Longhorn
Rancher Local-Path, pre-installed with k3s
NFS server for storing backup

API Gateway

Istio and K8s gateway API for external traffic
Traefik for internal use ( domains accessible in LAN )
Automated TLS with Cert-manager and Let's encrypt with DNS challenge

CI / CD

Drone CI
Helm: stored in a git repository
Harbor: self hosted image registry

Monitoring:

Hardware and resources:
- Prometheus: collect metrics
- Grafana: visualization
[🚧] Network traffic:
- L3/L4: Cilium Hubble
- L7: Istio Telemetry and Zipkin
[🚧] Logs:
- Loki or Elasticsearch: for storing logs
- Grafana Agent or Filebeat + Logstash: collect logs
[🚧] Alerting:
- Alertmanager

Network topology

I have two PCs: one serves as the main server, and the other as my VPN server ( my modem-router doesn't have VPN built in ).
[Figure: server-setup]

The main server and the VPN server are connected to the same modem-router.

Getting a domain

I chose CloudFlare for domain registration and DNS provider.
The documentation is pretty straightforward: I bought a domain and set the DNS records through the UI.

DDNS

Because I don't have a static public IP, I need to set up DDNS ( Dynamic DNS ) for my domains to work.
To do this, I use ipify's API to get my current public IP, and CloudFlare's API to update my DNS A record every 5 minutes.
My DNS records consist of one A record and a bunch of CNAME records, so I only need to keep one record up to date.
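
The update loop described above can be sketched roughly as follows. This is a minimal sketch, not the actual container I run: it assumes `curl` is installed, and `CF_API_TOKEN`, `CF_ZONE_ID`, `CF_RECORD_ID` and `CF_RECORD_NAME` are placeholder environment variables you would have to supply. The TTL of 300 and `proxied: false` are also assumptions. Unlike a plain "update every 5 minutes" job, it only calls CloudFlare when the IP has actually changed.

```shell
#!/bin/sh
# DDNS sketch: keep one CloudFlare A record pointed at the current public IP.
# CF_API_TOKEN, CF_ZONE_ID, CF_RECORD_ID and CF_RECORD_NAME are placeholder
# environment variables, not names taken from the original setup.

# Build the JSON body for CloudFlare's "update DNS record" call.
build_payload() {
    # $1 = record name, $2 = IPv4 address
    printf '{"type":"A","name":"%s","content":"%s","ttl":300,"proxied":false}' "$1" "$2"
}

# Ask ipify for the current public IPv4 address (plain-text response).
current_ip() {
    curl -fsS https://api.ipify.org
}

# Overwrite the A record with the given IP.
update_record() {
    curl -fsS -X PUT \
        "https://api.cloudflare.com/client/v4/zones/${CF_ZONE_ID}/dns_records/${CF_RECORD_ID}" \
        -H "Authorization: Bearer ${CF_API_TOKEN}" \
        -H "Content-Type: application/json" \
        --data "$(build_payload "${CF_RECORD_NAME}" "$1")"
}

# Poll every 5 minutes and only call CloudFlare when the IP changed.
ddns_loop() {
    last_ip=""
    while true; do
        ip="$(current_ip)" || ip=""
        if [ -n "$ip" ] && [ "$ip" != "$last_ip" ]; then
            update_record "$ip" >/dev/null && last_ip="$ip"
        fi
        sleep 300
    done
}

# Start the loop only when explicitly requested, e.g. DDNS_RUN=1 ./ddns.sh
if [ "${DDNS_RUN:-0}" = "1" ]; then
    ddns_loop
fi
```

Because only the A record is updated, all the CNAME records that point at it follow along automatically.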

Later on, I found that I could get a static IP from my ISP quite easily. I haven't done it yet, but might try it out later.

VPN

As shown in the server setup diagram in the Network topology section, the VPN server runs on a separate host.
Originally, I planned on running everything on a single server, but I quickly ran into an obstacle.
Because I use Cilium's L2 Announcements for my LoadBalancer services on K3s, a client can only connect to a service through its external IP if it can receive ARP responses.
This is fine when I'm at home, and it also works when OpenVPN is in bridge mode, which uses a TAP interface.

A TAP interface is a virtual network interface with a MAC address, which is needed for ARP and broadcasts.

But I plan to share this homelab with a friend who uses a MacBook, and MACBOOKS DON'T SUPPORT TAP INTERFACES, so I had to run OpenVPN in TUN mode. A TUN interface does not have a MAC address, and thus cannot receive ARP responses from the server!
To solve this issue, we need to let another interface receive / send ARP for us.

My first thought was to use Docker. If I ran OpenVPN in a Docker container, it would have two interfaces: the TUN interface created by OpenVPN, and one end of a veth pair connected to the Docker bridge, which does have a MAC address. That second interface could then send ARP requests and receive ARP responses for us.

I was wrong. Although ARP requests were sent, no ARP response ever came back: the host knows the ARP request originated from itself, and Cilium, which is responsible for L2 Announcements, ignores it.
Later, I even tried using a VM instead of Docker, but the result was the same.

If you ssh into the server and try to access a LoadBalancer service, it works, but no ARP is sent; I guess Cilium has some workaround for this.

In the end, I separated the VPN server from the main server, and everything worked fine.

Install OpenVPN server

For a detailed explanation, please check out Ubuntu's OpenVPN installation guide.

Link: https://ubuntu.com/server/docs/how-to-install-and-use-openvpn

Here are the necessary steps:

Install OpenVPN and easy-rsa

You could use other tools such as openssl to generate CA.
But this is the easiest way to do it.

apt install openvpn easy-rsa

Install the easyrsa script into /usr/local/bin.

install /usr/share/easy-rsa/easyrsa /usr/local/bin

Set up PKI ( public key infrastructure )

Create easy-rsa directory

mkdir /etc/openvpn/easy-rsa
cd /etc/openvpn/easy-rsa

Initialize and create CA

easyrsa init-pki
easyrsa build-ca

Server private key, certificate and Diffie Hellman

easyrsa gen-req SERVERNAME nopass
easyrsa gen-dh
easyrsa sign-req server SERVERNAME

Copy them all to /etc/openvpn/ for easier reference; we will use them in our server configuration.

cp pki/dh.pem pki/ca.crt pki/issued/SERVERNAME.crt pki/private/SERVERNAME.key /etc/openvpn/

Generate the TLS authentication key in /etc/openvpn/

openvpn --genkey secret /etc/openvpn/tls-auth.key

Client private key and certificate

easyrsa gen-req CLIENTNAME nopass
easyrsa sign-req client CLIENTNAME

Example configurations

At this point you have finished the installation.
Here are some example configurations before we start the server.
You could find more examples at /usr/share/doc/openvpn/examples/sample-config-files/

Server

The file is a bit large and mostly consists of comments, but those are necessary for anyone who wants to understand the server configuration.

/etc/openvpn/server.conf

# Which TCP/UDP port should OpenVPN listen on?
# If you want to run multiple OpenVPN instances
# on the same machine, use a different port
# number for each one.  You will need to
# open up this port on your firewall.
port 1194

# TCP or UDP server?
# I couldn't get udp to work with TLS, so I went with tcp
proto tcp
;proto udp

# "dev tun" will create a routed IP tunnel,
# "dev tap" will create an ethernet tunnel.
# Use "dev tap0" if you are ethernet bridging
# and have precreated a tap0 virtual interface
# and bridged it with your ethernet interface.
# If you want to control access policies
# over the VPN, you must create firewall
# rules for the the TUN/TAP interface.
# On non-Windows systems, you can give
# an explicit unit number, such as tun0.
# On Windows, use "dev-node" for this.
# On most systems, the VPN will not function
# unless you partially or fully disable
# the firewall for the TUN/TAP interface.
dev tun

# SSL/TLS root certificate (ca), certificate
# (cert), and private key (key).  Each client
# and the server must have their own cert and
# key file.  The server and all clients will
# use the same ca file.
#
# See the "easy-rsa" directory for a series
# of scripts for generating RSA certificates
# and private keys.  Remember to use
# a unique Common Name for the server
# and each of the client certificates.
#
# Any X509 key management system can be used.
# OpenVPN can also use a PKCS #12 formatted key file
# (see "pkcs12" directive in man page).
ca ca.crt
cert SERVERNAME.crt
key SERVERNAME.key  # This file should be kept secret

# Diffie hellman parameters.
# Generate your own with:
#   openssl dhparam -out dh2048.pem 2048
dh dh.pem

# Network topology
# Should be subnet (addressing via IP)
# unless Windows clients v2.0.9 and lower have to
# be supported (then net30, i.e. a /30 per client)
# Defaults to net30 (not recommended)
topology subnet

# Configure server mode and supply a VPN subnet
# for OpenVPN to draw client addresses from.
# The server will take 10.99.99.1 for itself,
# the rest will be made available to clients.
# Each client will be able to reach the server
# on 10.99.99.1. Comment this line out if you are
# ethernet bridging. See the man page for more info.
server 10.99.99.0 255.255.255.0

# Maintain a record of client <-> virtual IP address
# associations in this file.  If OpenVPN goes down or
# is restarted, reconnecting clients can be assigned
# the same virtual IP address from the pool that was
# previously assigned.
ifconfig-pool-persist /var/log/openvpn/ipp.txt

# Let the client know it should route through VPN when 
# connecting to anything on our servers.
push "route 192.168.99.0 255.255.255.0" 

# Push DNS server config to clients.
# There is a DNS server on my main server.
# This way clients can resolve private domains.
push "dhcp-option DNS 192.168.99.55"

# The keepalive directive causes ping-like
# messages to be sent back and forth over
# the link so that each side knows when
# the other side has gone down.
# Ping every 10 seconds, assume that remote
# peer is down if no ping received during
# a 120 second time period.
keepalive 10 120

# For extra security beyond that provided
# by SSL/TLS, create an "HMAC firewall"
# to help block DoS attacks and UDP port flooding.
#
# Generate with:
#   openvpn --genkey secret ta.key
#
# The server and each client must have
# a copy of this key.
# The second parameter should be '0'
# on the server and '1' on the clients.
tls-auth tls-auth.key 0 # This file is secret
auth SHA256

# Select a cryptographic cipher.
# This config item must be copied to
# the client config file as well.
# Note that v2.4 client/server will automatically
# negotiate AES-256-GCM in TLS mode.
# See also the ncp-cipher option in the manpage
cipher AES-256-GCM

# The persist options will try to avoid
# accessing certain resources on restart
# that may no longer be accessible because
# of the privilege downgrade.
persist-key
persist-tun

# Output a short status file showing
# current connections, truncated
# and rewritten every minute.
status /var/log/openvpn/openvpn-status.log

# Set the appropriate level of log
# file verbosity.
#
# 0 is silent, except for fatal errors
# 4 is reasonable for general usage
# 5 and 6 can help to debug connection problems
# 9 is extremely verbose
verb 3

Client

Here we use "single-file" configuration

/etc/openvpn/client.conf

# Specify that we are a client and that we
# will be pulling certain config file directives
# from the server.
client

# Use the same setting as you are using on
# the server.
# On most systems, the VPN will not function
# unless you partially or fully disable
# the firewall for the TUN/TAP interface.
dev tun

# Are we connecting to a TCP or
# UDP server?  Use the same setting as
# on the server.
proto tcp

# The hostname/IP and port of the server.
# You can have multiple remote entries
# to load balance between the servers.
remote REMOTE_DOMAIN 1194

# Don't route everything through VPN.
# Only route through VPN when necessary.
# Such as: accessing our servers or some service only available inside our LAN.
pull-filter ignore redirect-gateway

# I have DNS installed on my host, so I will ignore the DNS setting 
# sent from the server and adjust DNS zone files on the client side.
# If you don't have extra DNS servers, you can comment this line out and just
# let the server setup your DNS setting when connected to VPN.
pull-filter ignore "dhcp-option DNS"

# Keep trying indefinitely to resolve the
# host name of the OpenVPN server.  Very useful
# on machines which are not permanently connected
# to the internet such as laptops.
resolv-retry infinite

# Most clients don't need to bind to
# a specific local port number.
nobind

# Try to preserve some state across restarts.
persist-key
persist-tun

# Add 'pull' to accept server 'pushed' settings,
# Our server will push DNS settings and routing configurations.
# We can ignore part of the 'pushed' settings like we did above.
pull

# Let openvpn run scripts.
# 'script-security 2' will give openvpn permission.
script-security 2

# If I didn't ignore DNS settings pushed by the server,
# this would allow us to update DNS settings.
# It uses 'resolvconf', make sure it's installed !!!
# This script should be present if you installed OpenVPN on Ubuntu.
up /etc/openvpn/update-resolv-conf
down /etc/openvpn/update-resolv-conf

# SSL/TLS parms.
# See the server config file for more
# description.  It's best to use
# a separate .crt/.key file pair
# for each client.  A single ca
# file can be used for all clients.

#ca ca.crt
# cert alexfunmula.crt
# key alexfunmula.key

<ca>
-----BEGIN CERTIFICATE-----
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
-----END CERTIFICATE-----
</ca>

<cert>
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
</cert>

<key>
-----BEGIN PRIVATE KEY-----
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
-----END PRIVATE KEY-----
</key>

# Verify server certificate by checking that the
# certificate has the correct key usage set.
# This is an important precaution to protect against
# a potential attack discussed here:
#  http://openvpn.net/howto.html#mitm
#
# To use this feature, you will need to generate
# your server certificates with the keyUsage set to
#   digitalSignature, keyEncipherment
# and the extendedKeyUsage to
#   serverAuth
# EasyRSA can do this for you.
remote-cert-tls server

# alg used for tls-auth HMAC
auth SHA256

# Static key 
# If a tls-auth key is used on the server
# then every client must also have the key.

# tls-auth tls-auth.key 1

<tls-auth>
#
# 2048 bit OpenVPN static key
#
-----BEGIN OpenVPN Static key V1-----
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXX
-----END OpenVPN Static key V1-----
</tls-auth>

key-direction 1

# Select a cryptographic cipher.
# If the cipher option is used on the server
# then you must also specify it here.
# Note that v2.4 client/server will automatically
# negotiate AES-256-GCM in TLS mode.
# See also the data-ciphers option in the manpage
#
# I still don't quite get this part...
cipher AES-256-GCM

# Set log file verbosity.
verb 3
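
The inline blocks in the client file above can be filled in by hand, or assembled with a small script. The following is a sketch under a few assumptions: the base directives (everything except the inline blocks) live in a separate file, the PEM files end with a newline, and the function names `inline_block` and `make_inline_config` are made up for this example.

```shell
#!/bin/sh
# Assemble a single-file OpenVPN client config from separate PEM files.
# make_inline_config and its argument order are illustrative, not part of
# OpenVPN or easy-rsa.

# Print one inline block: <tag>, the file contents, then </tag>.
inline_block() {
    # $1 = tag name (ca, cert, key, tls-auth), $2 = file to embed
    printf '<%s>\n' "$1"
    cat "$2"
    printf '</%s>\n\n' "$1"
}

# Usage: make_inline_config BASE_CONF CA CERT KEY TLS_AUTH_KEY > client.conf
make_inline_config() {
    cat "$1"
    printf '\n'
    inline_block ca "$2"
    inline_block cert "$3"
    inline_block key "$4"
    inline_block tls-auth "$5"
    # With an inline tls-auth key, the direction is set by its own directive.
    printf 'key-direction 1\n'
}
```

With the paths from the PKI steps, a call might look like `make_inline_config client-base.conf pki/ca.crt pki/issued/CLIENTNAME.crt pki/private/CLIENTNAME.key /etc/openvpn/tls-auth.key > CLIENTNAME.conf`, where `client-base.conf` is a hypothetical file name.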

Start the server

Place server configuration at /etc/openvpn. Start openvpn service if needed.

systemctl start openvpn.service

Start the server with our configuration. In this case, our server configuration is called server.conf, which is why we type @server after openvpn.

systemctl start openvpn@server

Check if it started correctly

journalctl -u openvpn@server -f

Aug 14 15:17:16 alex-server-2 systemd[1]: Starting openvpn@server.service - OpenVPN connection to server...
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Note: Kernel support for ovpn-dco missing, disabling data channel offload.
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: OpenVPN 2.6.9 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] [DCO]
Aug 14 15:17:16 alex-server-2 systemd[1]: Started openvpn@server.service - OpenVPN connection to server.
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: library versions: OpenSSL 3.0.13 30 Jan 2024, LZO 2.10
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: DCO version: N/A
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: net_route_v4_best_gw query: dst 0.0.0.0
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: net_route_v4_best_gw result: via 192.168.99.1 dev enp3s0
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Diffie-Hellman initialized with 2048 bit key
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: TUN/TAP device tun0 opened
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: net_iface_mtu_set: mtu 1500 for tun0
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: net_iface_up: set tun0 up
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: net_addr_v4_add: 10.99.99.1/24 dev tun0
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Could not determine IPv4/IPv6 protocol. Using AF_INET
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Socket Buffers: R=[131072->131072] S=[16384->16384]
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Listening for incoming TCP connection on [AF_INET][undef]:1999
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: TCPv4_SERVER link local (bound): [AF_INET][undef]:1999
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: TCPv4_SERVER link remote: [AF_UNSPEC]
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: MULTI: multi_init called, r=256 v=256
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: IFCONFIG POOL IPv4: base=10.99.99.2 size=253
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: IFCONFIG POOL LIST
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: MULTI: TCP INIT maxclients=1024 maxevents=1029
Aug 14 15:17:16 alex-server-2 ovpn-server[6903]: Initialization Sequence Completed

Check if tun interface is created

ip a

3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 10.99.99.1/24 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::8244:bf6:d39e:11f/64 scope link stable-privacy
       valid_lft forever preferred_lft forever

Routing configuration

In order for clients to reach IPs beyond the VPN server itself, we need a few more settings.

Enable forwarding at `/etc/sysctl.conf`

net.ipv4.ip_forward=1

Run sysctl -p to apply changes

Set up MASQUERADE for outgoing traffic from vpn client

iptables -t nat -A POSTROUTING -s 10.99.99.0/24 -o enp3s0 -j MASQUERADE

10.99.99.0/24 is the CIDR for VPN connections, and enp3s0 is the interface that connects to our LAN. The name of the interface might be different on your machine, so adjust accordingly.

Set iptables FORWARD policy to ACCEPT

Normally we don't need to do this, since ACCEPT is the default policy for FORWARD. But if Docker is installed, it changes the policy to DROP, so we need to set it back to ACCEPT.

iptables --policy FORWARD ACCEPT

Alternatively, we could add iptables rules that specifically allow forwarding from VPN clients; I just chose to update the default policy.
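
For reference, the narrower alternative could look roughly like this. It is a sketch, not something I run: it assumes the OpenVPN interface is `tun0`, reuses the subnet and LAN interface from the MASQUERADE rule, and only prints the commands so they can be reviewed first. Depending on how Docker orders its own chains, the rules may need to go into the DOCKER-USER chain instead, so treat the chain choice as an assumption to verify.

```shell
#!/bin/sh
# Print iptables commands that allow forwarding only for VPN clients,
# instead of switching the whole FORWARD policy to ACCEPT.
# Defaults mirror this guide; tun0 is assumed to be the OpenVPN interface.
VPN_CIDR="${VPN_CIDR:-10.99.99.0/24}"
VPN_IF="${VPN_IF:-tun0}"
LAN_IF="${LAN_IF:-enp3s0}"

vpn_forward_rules() {
    # VPN clients -> LAN
    echo "iptables -A FORWARD -i ${VPN_IF} -o ${LAN_IF} -s ${VPN_CIDR} -j ACCEPT"
    # Replies from the LAN back to VPN clients
    echo "iptables -A FORWARD -i ${LAN_IF} -o ${VPN_IF} -d ${VPN_CIDR} -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT"
}

# Dry run by default: review the output, then apply it as root with
#   vpn_forward_rules | sh
vpn_forward_rules
```

The second rule relies on connection tracking, so LAN hosts can answer VPN clients but cannot open new connections towards them.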

Persist changes made to iptables

iptables-save > /etc/iptables/rules.v4

Might need to install iptables-persistent first

Start the client

Place client configuration at /etc/openvpn/
Start openvpn service if needed.

systemctl start openvpn.service

Start with our client config (same concept as selecting the server config)

systemctl start openvpn@client

Check interfaces, ip routes and some other setting we have pulled from server.
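
The routing part of that check can be scripted. This is a minimal sketch, assuming the iproute2 `ip` tool on the client and that the VPN interface is `tun0`; `route_dev` and `check_vpn_route` are hypothetical helper names.

```shell
#!/bin/sh
# Check that traffic for a LAN address leaves through the VPN interface.

# Read `ip route get` output on stdin and print the interface after "dev".
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Succeeds when the route to $1 uses the VPN interface (tun0 by default).
check_vpn_route() {
    dev="$(ip route get "$1" | route_dev)"
    [ "$dev" = "${VPN_IF:-tun0}" ]
}
```

For example, `ip addr show tun0` shows the address the server assigned, and `check_vpn_route 192.168.99.55 && echo "routed through VPN"` confirms that the pushed route is in place.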

Connection check

Ping the server

ping 10.99.99.1

PING 10.99.99.1 (10.99.99.1) 56(84) bytes of data.
64 bytes from 10.99.99.1: icmp_seq=1 ttl=64 time=4.76 ms
64 bytes from 10.99.99.1: icmp_seq=2 ttl=64 time=5.21 ms
...

Ping something outside the server ( one that needs to be routed through the VPN, of course )

ping 192.168.99.55

PING 192.168.99.55 (192.168.99.55) 56(84) bytes of data.
64 bytes from 192.168.99.55: icmp_seq=1 ttl=63 time=6.01 ms
64 bytes from 192.168.99.55: icmp_seq=2 ttl=63 time=5.22 ms
...

Local DNS

With a local DNS, we can create domains freely in our LAN.

Installation

I went with CoreDNS, and ran it as a docker container.

Configuration file structure:

.
├── config
│   ├── Corefile
│   └── db.alexfangsw.com
└── run.sh

Corefile:

# Anything that this DNS server doesn't know will be forwarded to
# external DNS: 1.1.1.1 ( cloudflare ), 8.8.8.8 ( google )
. {
	reload	
	forward . 1.1.1.1 8.8.8.8
	cache 300
	log
}

# This 'zone' is for domains in my LAN.
# Domains that match these patterns ( "cloud.alexfangsw.com", "*.cloud.alexfangsw.com" )
# could be resolved in this zone.
cloud.alexfangsw.com {
	reload
	auto {
        # This is the 'zone file', it contains DNS record for this zone
		directory db.alexfangsw.com 
	}
	cache 300
	log
}

db.alexfangsw.com: ( zone file )

$ORIGIN cloud.alexfangsw.com.
@ 3600 IN SOA dns.cloud.alexfangsw.com. [email protected]. (
    2       ; serial
    7200    ; refresh
    3600    ; retry
    1209600 ; expire
    3600    ; minimum
    )

; This is our 'traefik' and 'istio' load balancer IP
; You might have a different IP if you installed Cilium with 
; a different configuration.
; We will get to install Cilium in later sections.

cloud.alexfangsw.com. IN A    192.168.99.60
gateway-istio  IN A    192.168.99.61

; Because there isn't a '.' at the end of the name, it is not an FQDN, so
; $ORIGIN ( cloud.alexfangsw.com. ) will be appended automatically.
; ex: 
;   aaa -> aaa.cloud.alexfangsw.com.
;   *   -> *.cloud.alexfangsw.com.
;
; Wildcards are supported. With this CNAME, I want all subdomains that match the
; pattern to resolve to 192.168.99.60

*   IN CNAME cloud.alexfangsw.com.
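
The `$ORIGIN` rule described in the zone file comments can be sketched as a small shell function. This only illustrates the naming rule, it is not how CoreDNS implements it, and `expand_name` is a made-up helper name.

```shell
#!/bin/sh
# Expand a zone-file owner name the way $ORIGIN is applied.
expand_name() {
    # $1 = name as written in the zone file, $2 = origin (with trailing dot)
    case "$1" in
        @)   printf '%s\n' "$2" ;;          # "@" stands for the origin itself
        *.)  printf '%s\n' "$1" ;;          # trailing dot: already an FQDN
        *)   printf '%s.%s\n' "$1" "$2" ;;  # relative name: append the origin
    esac
}

expand_name aaa cloud.alexfangsw.com.                    # aaa.cloud.alexfangsw.com.
expand_name '*' cloud.alexfangsw.com.                    # *.cloud.alexfangsw.com.
expand_name cloud.alexfangsw.com. cloud.alexfangsw.com.  # cloud.alexfangsw.com.
```

Quote `*` when calling the function, otherwise the shell expands it to file names before the function sees it.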

run.sh:

#!/bin/bash

# Set a fixed IP for this container to make sure it's consistent throughout 
# restarts, we will be directing all DNS resolution to this DNS after we started
# this container.

docker run -d --name coredns \
    --mount type=bind,source="$PWD/config",target="/coreDNS/config" \
    -p 53:53/tcp -p 53:53/udp \
    -w /coreDNS/config \
    --restart=unless-stopped \
    --ip 172.20.0.2 \
    --network infra \
    coredns/coredns \
    -conf Corefile -dns.port 53 && \
    docker logs -f coredns

Before we start the container, we need to create a docker network, this is necessary if we wish to set a fixed IP for our container.

docker network create infra --subnet 172.20.0.0/24

Start the DNS server

./run.sh

Let's first check that it can resolve external domains.

dig google.com @172.20.0.2

I removed some of the output; the key point here is that we used our DNS server to resolve the domain, and it was successful.

;; ANSWER SECTION:
google.com.             105     IN      A       142.251.42.238

;; Query time: 0 msec
;; SERVER: 172.20.0.2#53(172.20.0.2) (UDP)
;; WHEN: Fri Sep 06 10:53:44 CST 2024
;; MSG SIZE  rcvd: 77

Now for the internal domains.

dig cloud.alexfangsw.com @172.20.0.2

It successfully resolved our domain with the A record. :)

;; ANSWER SECTION:
cloud.alexfangsw.com.   5       IN      A       192.168.99.60

;; Query time: 1 msec
;; SERVER: 172.20.0.2#53(172.20.0.2) (UDP)
;; WHEN: Fri Sep 06 02:59:56 UTC 2024
;; MSG SIZE  rcvd: 97

Let's check if the wildcard CNAME works.

dig aaa.bbb.cloud.alexfangsw.com @172.20.0.2

Ya~ It resolved it with the CNAME record.

;; ANSWER SECTION:
aaa.bbb.cloud.alexfangsw.com. 5 IN      CNAME   cloud.alexfangsw.com.
cloud.alexfangsw.com.   5       IN      A       192.168.99.60

;; Query time: 0 msec
;; SERVER: 172.20.0.2#53(172.20.0.2) (UDP)
;; WHEN: Fri Sep 06 03:01:25 UTC 2024
;; MSG SIZE  rcvd: 167

Set self hosted DNS as default

The last step is to set our self hosted DNS as the default DNS for our machine.

Edit `/etc/systemd/resolved.conf`

# /etc/systemd/resolved.conf
[Resolve]
DNS=172.20.0.2
DNSStubListener=no

Restart `systemd-resolved`

To apply our settings, we need to restart systemd-resolved

systemctl restart systemd-resolved.service

Now if we check the logs, we should see that all DNS resolutions go through our self-hosted DNS server.

docker logs -f coredns

...
[INFO] 172.18.0.1:38664 - 26709 "A IN mqtt1.earthquake.tw. udp 37 false 512" NOERROR qr,aa,rd,ra 72 0.00006474s
[INFO] 172.18.0.1:38664 - 56411 "AAAA IN mqtt1.earthquake.tw. udp 37 false 512" NOERROR qr,aa,rd,ra 129 0.0000914s
[INFO] 172.18.0.1:57084 - 61796 "A IN mqtt1.earthquake.tw. udp 37 false 512" NOERROR qr,aa,rd,ra 72 0.00007199s
...

Container orchestration

K3s is a lightweight K8s distribution developed by Rancher Labs. While K8s runs its components as separate processes, K3s runs them as separate goroutines in a single process. It is easy to set up and can run as a single-node or multi-node cluster.

As for documentation, you can mostly just read the K8s documentation, since the functionality is basically the same ( deployments, daemonsets, statefulsets, RBAC, configmaps, secrets... etc ). The main difference I found is how K3s loads its configuration, such as feature gates and kubelet config, but that is covered in the K3s documentation.

Cilium is used here as our CNI plugin and kube-proxy replacement.

Let's take a look at kube-proxy. kube-proxy uses iptables to manage the network of our cluster; here is an example of how iptables routes traffic to a pod.
Assume that we are connecting to 'Kibana' using its ClusterIP service.

kubectl -n elasticsearch get svc

NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
quickstart-kb-http            ClusterIP   10.43.220.144   <none>        5601/TCP   299d

kubectl -n elasticsearch get pods -o wide

NAME                             READY   STATUS    RESTARTS       AGE     IP            NODE                       NOMINATED NODE   READINESS GATES
quickstart-kb-848f768f6d-c8mzz   1/1     Running   305 (5h ago)   298d    10.42.0.193   alex-system-product-name   <none>           <none>

The ClusterIP is "10.43.220.144" and the pod IP is "10.42.0.193".

This is my old K3s cluster, deployed on my PC. I power it off every day, which is why the restart count is pretty high.

Now let's look at the iptables rules needed for this operation.

# We start from PREROUTING.
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES

# Jump to one of the KUBE-SERVICES that matches the cluster IP and port.
-A KUBE-SERVICES -d 10.43.220.144/32 -p tcp -m comment --comment "elasticsearch/quickstart-kb-http:http cluster IP" -m tcp --dport 5601 -j KUBE-SVC-O6UG2CVGYA7SXHZV

# Jump to another rule that checks if the request is from one of the nodes inside this cluster.
# Nodes in this cluster will have a flannel interface with CIDR of 10.42.0.0/16.
-A KUBE-SVC-O6UG2CVGYA7SXHZV ! -s 10.42.0.0/16 -d 10.43.220.144/32 -p tcp -m comment --comment "elasticsearch/quickstart-kb-http:http cluster IP" -m tcp --dport 5601 -j KUBE-MARK-MASQ
# To make things simpler, let's assume that we made the request on one of the nodes
# and matched this rule.
-A KUBE-SVC-O6UG2CVGYA7SXHZV -m comment --comment "elasticsearch/quickstart-kb-http:http -> 10.42.0.193:5601" -j KUBE-SEP-CGVFLKT7QTRSIZZO

# At this stage, our destination IP is changed into the pod IP.
-A KUBE-SEP-CGVFLKT7QTRSIZZO -p tcp -m comment --comment "elasticsearch/quickstart-kb-http:http" -m tcp -j DNAT --to-destination 10.42.0.193:5601

# Because we made the request inside the cluster, these POSTROUTING rules don't do much.
# And we have successfully connected to our pod !!
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully

The key point here is that iptables uses IPs to know where each request needs to go. In the world of Kubernetes, IPs and pods are basically throwaway: every time we redeploy a pod, the iptables rules on all nodes need to be updated. If any node fails to sync its iptables rules, the cluster falls into a weird state, which is not ideal.

Cilium solves this problem by using eBPF. With eBPF, we can implement routing rules in different programming languages and access more information than iptables can, which means we are not restricted to making routing decisions based on IPs, and this opens the door to more optimizations along the way.

Installation

/etc/rancher/k3s/config.yaml

cluster-init: true

# Custom resolv.conf, makes our cluster use our local DNS.
# Although this is normally in kubelet config file, k3s puts it here.
resolv-conf: /root/infra/resolv.conf

# Disable traefik and serviceLB, we will be installing traefik ourselves, and use Cilium for
# our load balancers.
disable:
        - servicelb
        - traefik
# Cilium will replace all of this, disable them all.
disable-kube-proxy: true
flannel-backend: "none"
disable-network-policy: true

kubelet-arg:
        - config=<PATH TO kubeletconfig.yaml>
advertise-address:
        - 192.168.99.55
node-ip:
        - 192.168.99.55
token:
        - SOME TOKEN
debug: false

/root/infra/resolv.conf

nameserver 192.168.99.55

kubeletconfig.yaml

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 250
featureGates:
  MatchLabelKeysInPodTopologySpread: true
# Reserve resource for system and k8s related services,
# this will let kubelet have enough resource to send node status.
kubeReserved:
  cpu: 200m
  memory: 500Mi
systemReserved:
  cpu: 200m
  memory: 500Mi

Install K3s

A list of K3s channels: https://update.k3s.io/v1-release/channels

curl -sfL https://get.k3s.io | INSTALL_K3S_CHANNEL=v1.28 sh -s - server

No pods can be deployed for now since Cilium isn't installed. Just keep going.

Cilium values.yaml

We will be installing Cilium with helm. The full configuration for Cilium is quite large, so I folded it by default.
I basically used helm to get the full values.yaml and made some adjustments to these fields.

k8sServiceHost
devices
l2announcements
ipam
kubeProxyReplacement
enableIPv4Masquerade
ipv4NativeRoutingCIDR
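
The helm workflow around that values file can be sketched as follows. The release name `cilium`, the `kube-system` namespace and the `values.yaml` file name are assumptions, and no chart version is pinned here; on K3s, helm also needs to find the cluster, for example via `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml`.

```shell
#!/bin/sh
# Helm workflow sketch for Cilium. Release name, namespace and file names are
# assumptions; add --version to pin the chart release you actually want.
CILIUM_NAMESPACE="${CILIUM_NAMESPACE:-kube-system}"
VALUES_FILE="${VALUES_FILE:-values.yaml}"

# Register the official Cilium chart repository.
add_cilium_repo() {
    helm repo add cilium https://helm.cilium.io/
    helm repo update
}

# Dump the chart's default values so they can be edited by hand.
dump_default_values() {
    helm show values cilium/cilium > "${VALUES_FILE}"
}

# Install (or upgrade) Cilium with the edited values file.
install_cilium() {
    helm upgrade --install cilium cilium/cilium \
        --namespace "${CILIUM_NAMESPACE}" \
        -f "${VALUES_FILE}"
}
```

A typical order would be `add_cilium_repo`, then `dump_default_values`, then editing the fields listed above, and finally `install_cilium`.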

Open to see values.yaml for cilium

# @schema
# type: [null, string]
# @schema
# -- upgradeCompatibility helps users upgrading to ensure that the configMap for
# Cilium will not change critical values to ensure continued operation
# This flag is not required for new installations.
# For example: '1.7', '1.8', '1.9'
upgradeCompatibility: null
debug:
  # -- Enable debug logging
  enabled: false
  # @schema
  # type: [null, string]
  # @schema
  # -- Configure verbosity levels for debug logging
  # This option is used to enable debug messages for operations related to such
  # sub-system such as (e.g. kvstore, envoy, datapath or policy), and flow is
  # for enabling debug messages emitted per request, message and connection.
  # Multiple values can be set via a space-separated string (e.g. "datapath envoy").
  #
  # Applicable values:
  # - flow
  # - kvstore
  # - envoy
  # - datapath
  # - policy
  verbose: ~
rbac:
  # -- Enable creation of Resource-Based Access Control configuration.
  create: true
# -- Configure image pull secrets for pulling container images
imagePullSecrets: []
# - name: "image-pull-secret"

# -- (string) Kubernetes config path
# @default -- `"~/.kube/config"`
kubeConfigPath: ""
# -- (string) Kubernetes service host - use "auto" for automatic lookup from the cluster-info ConfigMap (kubeadm-based clusters only)
k8sServiceHost: "192.168.99.55"
# @schema
# type: [string, integer]
# @schema
# -- (string) Kubernetes service port
k8sServicePort: "6443"
# -- Configure the client side rate limit for the agent and operator
#
# If the amount of requests to the Kubernetes API server exceeds the configured
# rate limit, the agent and operator will start to throttle requests by delaying
# them until there is budget or the request times out.
k8sClientRateLimit:
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) The sustained request rate in requests per second.
  # @default -- 5 for k8s up to 1.26. 10 for k8s version 1.27+
  qps:
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) The burst request rate in requests per second.
  # The rate limiter will allow short bursts with a higher rate.
  # @default -- 10 for k8s up to 1.26. 20 for k8s version 1.27+
  burst:
cluster:
  # -- Name of the cluster. Only required for Cluster Mesh and mutual authentication with SPIRE.
  # It must respect the following constraints:
  # * It must contain at most 32 characters;
  # * It must begin and end with a lower case alphanumeric character;
  # * It may contain lower case alphanumeric characters and dashes between.
  # The "default" name cannot be used if the Cluster ID is different from 0.
  name: default
  # -- (int) Unique ID of the cluster. Must be unique across all connected
  # clusters and in the range of 1 to 255. Only required for Cluster Mesh,
  # may be 0 if Cluster Mesh is not used.
  id: 0
# -- Define serviceAccount names for components.
# @default -- Component's fully qualified name.
serviceAccounts:
  cilium:
    create: true
    name: cilium
    automount: true
    annotations: {}
  nodeinit:
    create: true
    # -- Enabled is temporary until https://github.com/cilium/cilium-cli/issues/1396 is implemented.
    # Cilium CLI doesn't create the SAs for node-init, thus the workaround. Helm is not affected by
    # this issue. Name and automount can be configured, if enabled is set to true.
    # Otherwise, they are ignored. Enabled can be removed once the issue is fixed.
    # Cilium-nodeinit DS must also be fixed.
    enabled: false
    name: cilium-nodeinit
    automount: true
    annotations: {}
  envoy:
    create: true
    name: cilium-envoy
    automount: true
    annotations: {}
  operator:
    create: true
    name: cilium-operator
    automount: true
    annotations: {}
  preflight:
    create: true
    name: cilium-pre-flight
    automount: true
    annotations: {}
  relay:
    create: true
    name: hubble-relay
    automount: false
    annotations: {}
  ui:
    create: true
    name: hubble-ui
    automount: true
    annotations: {}
  clustermeshApiserver:
    create: true
    name: clustermesh-apiserver
    automount: true
    annotations: {}
  # -- Clustermeshcertgen is used if clustermesh.apiserver.tls.auto.method=cronJob
  clustermeshcertgen:
    create: true
    name: clustermesh-apiserver-generate-certs
    automount: true
    annotations: {}
  # -- Hubblecertgen is used if hubble.tls.auto.method=cronJob
  hubblecertgen:
    create: true
    name: hubble-generate-certs
    automount: true
    annotations: {}
# -- Configure termination grace period for cilium-agent DaemonSet.
terminationGracePeriodSeconds: 1
# -- Install the cilium agent resources.
agent: true
# -- Agent container name.
name: cilium
# -- Roll out cilium agent pods automatically when configmap is updated.
rollOutCiliumPods: false
# -- Agent container image.
image:
  # @schema
  # type: [null, string]
  # @schema
  override: ~
  repository: "quay.io/cilium/cilium"
  tag: "v1.16.0"
  pullPolicy: "IfNotPresent"
  # cilium-digest
  digest: "sha256:46ffa4ef3cf6d8885dcc4af5963b0683f7d59daa90d49ed9fb68d3b1627fe058"
  useDigest: true
# -- Affinity for cilium-agent.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            k8s-app: cilium
# -- Node selector for cilium-agent.
nodeSelector:
  kubernetes.io/os: linux
# -- Node tolerations for agent scheduling to nodes with taints
# ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations:
  - operator: Exists
    # - key: "key"
    #   operator: "Equal|Exists"
    #   value: "value"
    #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
# -- The priority class to use for cilium-agent.
priorityClassName: ""
# -- DNS policy for Cilium agent pods.
# Ref: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
dnsPolicy: ""
# -- Additional containers added to the cilium DaemonSet.
extraContainers: []
# -- Additional initContainers added to the cilium Daemonset.
extraInitContainers: []
# -- Additional agent container arguments.
extraArgs: []
# -- Additional agent container environment variables.
extraEnv: []
# -- Additional agent hostPath mounts.
extraHostPathMounts: []
# - name: host-mnt-data
#   mountPath: /host/mnt/data
#   hostPath: /mnt/data
#   hostPathType: Directory
#   readOnly: true
#   mountPropagation: HostToContainer

# -- Additional agent volumes.
extraVolumes: []
# -- Additional agent volumeMounts.
extraVolumeMounts: []
# -- extraConfig allows you to specify additional configuration parameters to be
# included in the cilium-config configmap.
extraConfig: {}
#  my-config-a: "1234"
#  my-config-b: |-
#    test 1
#    test 2
#    test 3

# -- Annotations to be added to all top-level cilium-agent objects (resources under templates/cilium-agent)
annotations: {}
# -- Security Context for cilium-agent pods.
podSecurityContext:
  # -- AppArmorProfile options for the `cilium-agent` and init containers
  appArmorProfile:
    type: "Unconfined"
# -- Annotations to be added to agent pods
podAnnotations: {}
# -- Labels to be added to agent pods
podLabels: {}
# -- Agent resource limits & requests
# ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
resources: {}
#   limits:
#     cpu: 4000m
#     memory: 4Gi
#   requests:
#     cpu: 100m
#     memory: 512Mi

# -- resources & limits for the agent init containers
initResources: {}
securityContext:
  # -- User to run the pod with
  # runAsUser: 0
  # -- Run the pod with elevated privileges
  privileged: false
  # -- SELinux options for the `cilium-agent` and init containers
  seLinuxOptions:
    level: 's0'
    # Running with spc_t since we have removed the privileged mode.
    # Users can change it to a different type as long as they have the
    # type available on the system.
    type: 'spc_t'
  capabilities:
    # -- Capabilities for the `cilium-agent` container
    ciliumAgent:
      # Used to set socket permission
      - CHOWN
      # Used to terminate envoy child process
      - KILL
      # Used since cilium modifies routing tables, etc...
      - NET_ADMIN
      # Used since cilium creates raw sockets, etc...
      - NET_RAW
      # Used since cilium monitor uses mmap
      - IPC_LOCK
      # Used in iptables. Consider removing once we are iptables-free
      - SYS_MODULE
      # Needed to switch network namespaces (used for health endpoint, socket-LB).
      # We need it for now but might not need it for >= 5.11, especially
      # for the 'SYS_RESOURCE'.
      # In >= 5.8 there are already the BPF and PERFMON capabilities
      - SYS_ADMIN
      # Could be an alternative for the SYS_ADMIN for the RLIMIT_NPROC
      - SYS_RESOURCE
      # Both PERFMON and BPF require kernel 5.8, container runtime
      # cri-o >= v1.22.0 or containerd >= v1.5.0.
      # If available, SYS_ADMIN can be removed.
      #- PERFMON
      #- BPF
      # Allow discretionary access control (e.g. required for package installation)
      - DAC_OVERRIDE
      # Allow to set Access Control Lists (ACLs) on arbitrary files (e.g. required for package installation)
      - FOWNER
      # Allow to execute program that changes GID (e.g. required for package installation)
      - SETGID
      # Allow to execute program that changes UID (e.g. required for package installation)
      - SETUID
    # -- Capabilities for the `mount-cgroup` init container
    mountCgroup:
      # Only used for 'mount' cgroup
      - SYS_ADMIN
      # Used for nsenter
      - SYS_CHROOT
      - SYS_PTRACE
    # -- capabilities for the `apply-sysctl-overwrites` init container
    applySysctlOverwrites:
      # Required in order to access host's /etc/sysctl.d dir
      - SYS_ADMIN
      # Used for nsenter
      - SYS_CHROOT
      - SYS_PTRACE
    # -- Capabilities for the `clean-cilium-state` init container
    cleanCiliumState:
      # Most of the capabilities here are the same ones used in the
      # cilium-agent's container because this container can be used to
      # uninstall all Cilium resources, and therefore it is likely that
      # it will need the same capabilities.
      # Used since cilium modifies routing tables, etc...
      - NET_ADMIN
      # Used in iptables. Consider removing once we are iptables-free
      - SYS_MODULE
      # We need it for now but might not need it for >= 5.11, especially
      # for the 'SYS_RESOURCE'.
      # In >= 5.8 there are already the BPF and PERFMON capabilities
      - SYS_ADMIN
      # Could be an alternative for the SYS_ADMIN for the RLIMIT_NPROC
      - SYS_RESOURCE
      # Both PERFMON and BPF require kernel 5.8, container runtime
      # cri-o >= v1.22.0 or containerd >= v1.5.0.
      # If available, SYS_ADMIN can be removed.
      #- PERFMON
      #- BPF
# -- Cilium agent update strategy
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    # @schema
    # type: [integer, string]
    # @schema
    maxUnavailable: 2
# Configuration Values for cilium-agent
aksbyocni:
  # -- Enable AKS BYOCNI integration.
  # Note that this is incompatible with AKS clusters not created in BYOCNI mode:
  # use Azure integration (`azure.enabled`) instead.
  enabled: false
# @schema
# type: [boolean, string]
# @schema
# -- Enable installation of PodCIDR routes between worker
# nodes if worker nodes share a common L2 network segment.
autoDirectNodeRoutes: true
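# Note (a sketch under assumptions): autoDirectNodeRoutes is only meaningful
# with native routing, so the related keys elsewhere in this chart would be
# expected to look roughly like the lines below. The CIDR shown is the default
# K3s pod CIDR and is an assumption about this cluster:
#   routingMode: native
#   ipv4NativeRoutingCIDR: "10.42.0.0/16"
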
# -- Enable skipping of PodCIDR routes between worker
# nodes if the worker nodes are in a different L2 network segment.
directRoutingSkipUnreachable: false
# -- Annotate k8s node upon initialization with Cilium's metadata.
annotateK8sNode: false
azure:
  # -- Enable Azure integration.
  # Note that this is incompatible with AKS clusters created in BYOCNI mode: use
  # AKS BYOCNI integration (`aksbyocni.enabled`) instead.
  enabled: false
  # usePrimaryAddress: false
  # resourceGroup: group1
  # subscriptionID: 00000000-0000-0000-0000-000000000000
  # tenantID: 00000000-0000-0000-0000-000000000000
  # clientID: 00000000-0000-0000-0000-000000000000
  # clientSecret: 00000000-0000-0000-0000-000000000000
  # userAssignedIdentityID: 00000000-0000-0000-0000-000000000000
alibabacloud:
  # -- Enable AlibabaCloud ENI integration
  enabled: false
# -- Enable bandwidth manager to optimize TCP and UDP workloads and allow
# for rate-limiting traffic from individual Pods with EDT (Earliest Departure
# Time) through the "kubernetes.io/egress-bandwidth" Pod annotation.
bandwidthManager:
  # -- Enable bandwidth manager infrastructure (also a prerequisite for BBR)
  enabled: false
  # -- Activate BBR TCP congestion control for Pods
  bbr: false
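  # Sketch of how a Pod would opt in to rate limiting once `enabled: true` is
  # set. The Pod name, image and the 10M limit are made-up examples:
  #
  #   apiVersion: v1
  #   kind: Pod
  #   metadata:
  #     name: rate-limited-demo
  #     annotations:
  #       kubernetes.io/egress-bandwidth: "10M"
  #   spec:
  #     containers:
  #       - name: app
  #         image: nginx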
# -- Configure standalone NAT46/NAT64 gateway
nat46x64Gateway:
  # -- Enable RFC8215-prefixed translation
  enabled: false
# -- EnableHighScaleIPcache enables the special ipcache mode for high scale
# clusters. The ipcache content will be reduced to the strict minimum and
# traffic will be encapsulated to carry security identities.
highScaleIPcache:
  # -- Enable the high scale mode for the ipcache.
  enabled: false
# -- Configure L2 announcements
l2announcements:
  # -- Enable L2 announcements
  enabled: true
  # -- If a lease is not renewed for X duration, the current leader is considered dead, a new leader is picked
  # leaseDuration: 15s
  # -- The interval at which the leader will renew the lease
  # leaseRenewDeadline: 5s
  # -- The timeout between retries if renewal fails
  # leaseRetryPeriod: 2s
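  #
  # A minimal sketch of the two custom resources that usually accompany
  # `l2announcements.enabled: true`. They are applied with kubectl and are not
  # part of this values file. The resource names, the interface regex and the
  # IP range are assumptions for illustration, pick ones that match your LAN:
  #
  #   apiVersion: cilium.io/v2alpha1
  #   kind: CiliumLoadBalancerIPPool
  #   metadata:
  #     name: lan-pool
  #   spec:
  #     blocks:
  #       - start: "192.168.99.200"
  #         stop: "192.168.99.220"
  #   ---
  #   apiVersion: cilium.io/v2alpha1
  #   kind: CiliumL2AnnouncementPolicy
  #   metadata:
  #     name: lan-l2-policy
  #   spec:
  #     interfaces:
  #       - ^eno[0-9]+
  #     externalIPs: true
  #     loadBalancerIPs: true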
# -- Configure L2 pod announcements
l2podAnnouncements:
  # -- Enable L2 pod announcements
  enabled: false
  # -- Interface used for sending Gratuitous ARP pod announcements
  interface: "eth0"
# -- Configure BGP
bgp:
  # -- Enable BGP support inside Cilium; embeds a new ConfigMap for BGP inside
  # cilium-agent and cilium-operator
  enabled: false
  announce:
    # -- Enable allocation and announcement of service LoadBalancer IPs
    loadbalancerIP: false
    # -- Enable announcement of node pod CIDR
    podCIDR: false
# -- This feature set enables virtual BGP routers to be created via
# CiliumBGPPeeringPolicy CRDs.
bgpControlPlane:
  # -- Enables the BGP control plane.
  enabled: false
  # -- SecretsNamespace is the namespace which BGP support will retrieve secrets from.
  secretsNamespace:
    # -- Create secrets namespace for BGP secrets.
    create: false
    # -- The name of the secret namespace to which Cilium agents are given read access
    name: kube-system
pmtuDiscovery:
  # -- Enable path MTU discovery to send ICMP fragmentation-needed replies to
  # the client.
  enabled: false
bpf:
  autoMount:
    # -- Enable automatic mount of BPF filesystem
    # When `autoMount` is enabled, the BPF filesystem is mounted at
    # `bpf.root` path on the underlying host and inside the cilium agent pod.
    # If users disable `autoMount`, it's expected that users have mounted
    # bpffs filesystem at the specified `bpf.root` volume, and then the
    # volume will be mounted inside the cilium agent pod at the same path.
    enabled: true
  # -- Configure the mount point for the BPF filesystem
  root: /sys/fs/bpf
  # -- Enables pre-allocation of eBPF map values. This increases
  # memory usage but can reduce latency.
  preallocateMaps: false
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) Configure the maximum number of entries in auth map.
  # @default -- `524288`
  authMapMax: ~
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) Configure the maximum number of entries in the TCP connection tracking
  # table.
  # @default -- `524288`
  ctTcpMax: ~
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) Configure the maximum number of entries for the non-TCP connection
  # tracking table.
  # @default -- `262144`
  ctAnyMax: ~
  # -- Control events generated by the Cilium datapath exposed to Cilium monitor and Hubble.
  events:
    drop:
      # -- Enable drop events.
      enabled: true
    policyVerdict:
      # -- Enable policy verdict events.
      enabled: true
    trace:
      # -- Enable trace events.
      enabled: true
  # @schema
  # type: [null, integer]
  # @schema
  # -- Configure the maximum number of service entries in the
  # load balancer maps.
  lbMapMax: 65536
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) Configure the maximum number of entries for the NAT table.
  # @default -- `524288`
  natMax: ~
  # @schema
  # type: [null, integer]
  # @schema
  # -- (int) Configure the maximum number of entries for the neighbor table.
  # @default -- `524288`
  neighMax: ~
  # @schema
  # type: [null, integer]
  # @schema
  # @default -- `16384`
  # -- (int) Configures the maximum number of entries for the node table.
  nodeMapMax: ~
  # -- Configure the maximum number of entries in endpoint policy map (per endpoint).
  # @schema
  # type: [null, integer]
  # @schema
  policyMapMax: 16384
  # @schema
  # type: [null, number]
  # @schema
  # -- (float64) Configure auto-sizing for all BPF maps based on available memory.
  # ref: https://docs.cilium.io/en/stable/network/ebpf/maps/
  # @default -- `0.0025`
  mapDynamicSizeRatio: ~
  # -- Configure the level of aggregation for monitor notifications.
  # Valid options are none, low, medium, maximum.
  monitorAggregation: medium
  # -- Configure the typical time between monitor notifications for
  # active connections.
  monitorInterval: "5s"
  # -- Configure which TCP flags trigger notifications when seen for the
  # first time in a connection.
  monitorFlags: "all"
  # -- Allow cluster external access to ClusterIP services.
  lbExternalClusterIP: false
  # @schema
  # type: [null, boolean]
  # @schema
  # -- (bool) Enable native IP masquerade support in eBPF
  # @default -- `false`
  #
  # Packets sent from a pod to an address outside the cluster are masqueraded
  # (their source address is rewritten to the output device's IPv4 address).
  # This prevents packets from being dropped by hosts with rp_filter set,
  # which checks whether the source IP is reachable via the receiving interface.
  masquerade: false
  # @schema
  # type: [null, boolean]
  # @schema
  # -- (bool) Configure whether direct routing mode should route traffic via
  # host stack (true) or directly and more efficiently out of BPF (false) if
  # the kernel supports it. The latter has the implication that it will also
  # bypass netfilter in the host namespace.
  # @default -- `false`
  hostLegacyRouting: ~
  # @schema
  # type: [null, boolean]
  # @schema
  # -- (bool) Configure the eBPF-based TPROXY to reduce reliance on iptables rules
  # for implementing Layer 7 policy.
  # @default -- `false`
  tproxy: ~
  # @schema
  # type: [null, array]
  # @schema
  # -- (list) Configure explicitly allowed VLAN IDs for bpf logic bypass.
  # [0] will allow all VLAN IDs without any filtering.
  # @default -- `[]`
  vlanBypass: ~
  # -- (bool) Disable ExternalIP mitigation (CVE-2020-8554)
  # @default -- `false`
  disableExternalIPMitigation: false
  # -- (bool) Attach endpoint programs using tcx instead of legacy tc hooks on
  # supported kernels.
  # @default -- `true`
  enableTCX: true
  # -- (string) Mode for Pod devices for the core datapath (veth, netkit, netkit-l2, lb-only)
  # @default -- `veth`
  datapathMode: veth
# -- Enable BPF clock source probing for more efficient tick retrieval.
bpfClockProbe: false
# -- Clean all eBPF datapath state from the initContainer of the cilium-agent
# DaemonSet.
#
# WARNING: Use with care!
cleanBpfState: false
# -- Clean all local Cilium state from the initContainer of the cilium-agent
# DaemonSet. Implies cleanBpfState: true.
#
# WARNING: Use with care!
cleanState: false
# -- Wait for KUBE-PROXY-CANARY iptables rule to appear in "wait-for-kube-proxy"
# init container before launching cilium-agent.
# More context can be found in the commit message of below PR
# https://github.com/cilium/cilium/pull/20123
waitForKubeProxy: false
cni:
  # -- Install the CNI configuration and binary files into the filesystem.
  install: true
  # -- Remove the CNI configuration and binary files on agent shutdown. Enable this
  # if you're removing Cilium from the cluster. Disable this to prevent the CNI
  # configuration file from being removed during agent upgrade, which can cause
  # nodes to go unmanageable.
  uninstall: false
  # @schema
  # type: [null, string]
  # @schema
  # -- Configure chaining on top of other CNI plugins. Possible values:
  #  - none
  #  - aws-cni
  #  - flannel
  #  - generic-veth
  #  - portmap
  chainingMode: ~
  # @schema
  # type: [null, string]
  # @schema
  # -- A CNI network name into which the Cilium plugin should be added as a chained plugin.
  # This will cause the agent to watch for a CNI network with this network name. When it is
  # found, this will be used as the basis for Cilium's CNI configuration file. If this is
  # set, it assumes a chaining mode of generic-veth. As a special case, a chaining mode
  # of aws-cni implies a chainingTarget of aws-cni.
  chainingTarget: ~
  # -- Make Cilium take ownership over the `/etc/cni/net.d` directory on the
  # node, renaming all non-Cilium CNI configurations to `*.cilium_bak`.
  # This ensures no Pods can be scheduled using other CNI plugins during Cilium
  # agent downtime.
  exclusive: true
  # -- Configure the log file for CNI logging with retention policy of 7 days.
  # Disable CNI file logging by setting this field to empty explicitly.
  logFile: /var/run/cilium/cilium-cni.log
  # -- Skip writing of the CNI configuration. This can be used if
  # writing of the CNI configuration is performed by external automation.
  customConf: false
  # -- Configure the path to the CNI configuration directory on the host.
  confPath: /etc/cni/net.d
  # -- Configure the path to the CNI binary directory on the host.
  binPath: /opt/cni/bin
  # -- Specify the path to a CNI config to read from on agent start.
  # This can be useful if you want to manage your CNI
  # configuration outside of a Kubernetes environment. This parameter is
  # mutually exclusive with the 'cni.configMap' parameter. The agent will
  # write this to 05-cilium.conflist on startup.
  # readCniConf: /host/etc/cni/net.d/05-sample.conflist.input

  # -- When defined, configMap will mount the provided value as ConfigMap and
  # interpret the cniConf variable as CNI configuration file and write it
  # when the agent starts up
  # configMap: cni-configuration

  # -- Configure the key in the CNI ConfigMap to read the contents of
  # the CNI configuration from.
  configMapKey: cni-config
  # -- Configure the path to where to mount the ConfigMap inside the agent pod.
  confFileMountPath: /tmp/cni-configuration
  # -- Configure the path to where the CNI configuration directory is mounted
  # inside the agent pod.
  hostConfDirMountPath: /host/etc/cni/net.d
  # -- Specifies the resources for the cni initContainer
  resources:
    requests:
      cpu: 100m
      memory: 10Mi
  # -- Enable route MTU for pod netns when CNI chaining is used
  enableRouteMTUForCNIChaining: false
# -- (string) Configure how frequently garbage collection should occur for the datapath
# connection tracking table.
# @default -- `"0s"`
conntrackGCInterval: ""
# -- (string) Configure the maximum frequency for the garbage collection of the
# connection tracking table. Only affects the automatic computation for the frequency
# and has no effect when 'conntrackGCInterval' is set. This can be set to more frequently
# clean up unused identities created from ToFQDN policies.
conntrackGCMaxInterval: ""
# -- (string) Configure timeout in which Cilium will exit if CRDs are not available
# @default -- `"5m"`
crdWaitTimeout: ""
# -- Tail call hooks for custom eBPF programs.
customCalls:
  # -- Enable tail call hooks for custom eBPF programs.
  enabled: false
daemon:
  # -- Configure where Cilium runtime state should be stored.
  runPath: "/var/run/cilium"
  # @schema
  # type: [null, string]
  # @schema
  # -- Configure a custom list of possible configuration override sources
  # The default is "config-map:cilium-config,cilium-node-config". For supported
  # values, see the help text for the build-config subcommand.
  # Note that this value should be a comma-separated string.
  configSources: ~
  # @schema
  # type: [null, string]
  # @schema
  # -- allowedConfigOverrides is a list of config-map keys that can be overridden.
  # That is to say, if this value is set, config sources (excepting the first one) can
  # only override keys in this list.
  #
  # This takes precedence over blockedConfigOverrides.
  #
  # By default, all keys may be overridden. To disable overrides, set this to "none" or
  # change the configSources variable.
  allowedConfigOverrides: ~
  # @schema
  # type: [null, string]
  # @schema
  # -- blockedConfigOverrides is a list of config-map keys that may not be overridden.
  # In other words, if any of these keys appear in a configuration source excepting the
  # first one, they will be ignored
  #
  # This is ignored if allowedConfigOverrides is set.
  #
  # By default, all keys may be overridden.
  blockedConfigOverrides: ~
# -- Specify which network interfaces can run the eBPF datapath. This means
# that a packet sent from a pod to a destination outside the cluster will be
# masqueraded (to an output device IPv4 address), if the output device runs the
# program. When not specified, probing will automatically detect devices that have
# a non-local route. This should be used only when autodetection is not suitable.
devices: "eno1"
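# Notes on picking a value (a sketch, adjust to your host): `ip -br link` lists
# the interface names on the node. To my understanding several devices can be
# given as a YAML list, and a trailing "+" acts as a wildcard. The names below
# are hypothetical:
#   devices: ["eno1", "eno2"]
#   devices: "eth+"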

# -- Enables experimental support for the detection of new and removed datapath
# devices. When devices change the eBPF datapath is reloaded and services updated.
# If "devices" is set then only those devices, or devices matching a wildcard will
# be considered.
#
# This option has been deprecated and is a no-op.
enableRuntimeDeviceDetection: true
# -- Forces the auto-detection of devices, even if specific devices are explicitly listed
forceDeviceDetection: false
# -- Chains to ignore when installing feeder rules.
# disableIptablesFeederRules: ""

# -- Limit iptables-based egress masquerading to interface selector.
# egressMasqueradeInterfaces: ""

# -- Enable setting identity mark for local traffic.
# enableIdentityMark: true

# -- Enable Kubernetes EndpointSlice feature in Cilium if the cluster supports it.
# enableK8sEndpointSlice: true

# -- Enable CiliumEndpointSlice feature (deprecated, please use `ciliumEndpointSlice.enabled` instead).
enableCiliumEndpointSlice: false
ciliumEndpointSlice:
  # -- Enable Cilium EndpointSlice feature.
  enabled: false
  # -- List of rate limit options to be used for the CiliumEndpointSlice controller.
  # Each object in the list must have the following fields:
  # nodes: Count of nodes at which to apply the rate limit.
  # limit: The sustained request rate in requests per second. The maximum rate that can be configured is 50.
  # burst: The burst request rate in requests per second. The maximum burst that can be configured is 100.
  rateLimits:
    - nodes: 0
      limit: 10
      burst: 20
    - nodes: 100
      limit: 7
      burst: 15
    - nodes: 500
      limit: 5
      burst: 10
envoyConfig:
  # -- Enable CiliumEnvoyConfig CRD
  # CiliumEnvoyConfig CRD can also be implicitly enabled by other options.
  enabled: false
  # -- SecretsNamespace is the namespace in which envoy SDS will retrieve secrets from.
  secretsNamespace:
    # -- Create secrets namespace for CiliumEnvoyConfig CRDs.
    create: true
    # -- The name of the secret namespace to which Cilium agents are given read access.
    name: cilium-secrets
  # -- Interval in which an attempt is made to reconcile failed EnvoyConfigs. If the duration is zero, the retry is deactivated.
  retryInterval: 15s
ingressController:
  # -- Enable cilium ingress controller
  # This will automatically set enable-envoy-config as well.
  enabled: false
  # -- Set cilium ingress controller to be the default ingress controller
  # This will let cilium ingress controller route entries without ingress class set
  default: false
  # -- Default ingress load balancer mode
  # Supported values: shared, dedicated
  # For granular control, use the following annotations on the ingress resource:
  # "ingress.cilium.io/loadbalancer-mode: dedicated" (or "shared").
  loadbalancerMode: dedicated
  # -- Enforce https for host having matching TLS host in Ingress.
  # Incoming traffic to the http listener is answered with an HTTP 308 redirect, with the https location in the Location header.
  enforceHttps: true
  # -- Enable proxy protocol for all Ingress listeners. Note that _only_ Proxy protocol traffic will be accepted once this is enabled.
  enableProxyProtocol: false
  # -- IngressLBAnnotations are the annotation and label prefixes, which are used to filter annotations and/or labels to propagate from Ingress to the Load Balancer service
  ingressLBAnnotationPrefixes: ['lbipam.cilium.io', 'nodeipam.cilium.io', 'service.beta.kubernetes.io', 'service.kubernetes.io', 'cloud.google.com']
  # @schema
  # type: [null, string]
  # @schema
  # -- Default secret namespace for ingresses without .spec.tls[].secretName set.
  defaultSecretNamespace:
  # @schema
  # type: [null, string]
  # @schema
  # -- Default secret name for ingresses without .spec.tls[].secretName set.
  defaultSecretName:
  # -- SecretsNamespace is the namespace in which envoy SDS will retrieve TLS secrets from.
  secretsNamespace:
    # -- Create secrets namespace for Ingress.
    create: true
    # -- Name of Ingress secret namespace.
    name: cilium-secrets
    # -- Enable secret sync, which will make sure all TLS secrets used by Ingress are synced to secretsNamespace.name.
    # If disabled, TLS secrets must be maintained externally.
    sync: true
  # -- Load-balancer service in shared mode.
  # This is a single load-balancer service for all Ingress resources.
  service:
    # -- Service name
    name: cilium-ingress
    # -- Labels to be added for the shared LB service
    labels: {}
    # -- Annotations to be added for the shared LB service
    annotations: {}
    # -- Service type for the shared LB service
    type: LoadBalancer
    # @schema
    # type: [null, integer]
    # @schema
    # -- Configure a specific nodePort for insecure HTTP traffic on the shared LB service
    insecureNodePort: ~
    # @schema
    # type: [null, integer]
    # @schema
    # -- Configure a specific nodePort for secure HTTPS traffic on the shared LB service
    secureNodePort: ~
    # @schema
    # type: [null, string]
    # @schema
    # -- Configure a specific loadBalancerClass on the shared LB service (requires Kubernetes 1.24+)
    loadBalancerClass: ~
    # @schema
    # type: [null, string]
    # @schema
    # -- Configure a specific loadBalancerIP on the shared LB service
    loadBalancerIP: ~
    # @schema
    # type: [null, boolean]
    # @schema
    # -- Configure if node port allocation is required for LB service
    # ref: https://kubernetes.io/docs/concepts/services-networking/service/#load-balancer-nodeport-allocation
    allocateLoadBalancerNodePorts: ~
    # -- Control how traffic from external sources is routed to the LoadBalancer Kubernetes Service for Cilium Ingress in shared mode.
    # Valid values are "Cluster" and "Local".
    # ref: https://kubernetes.io/docs/reference/networking/virtual-ips/#external-traffic-policy
    externalTrafficPolicy: Cluster
  # Host Network related configuration
  hostNetwork:
    # -- Configure whether the Envoy listeners should be exposed on the host network.
    enabled: false
    # -- Configure a specific port on the host network that gets used for the shared listener.
    sharedListenerPort: 8080
    # Specify the nodes where the Ingress listeners should be exposed
    nodes:
      # -- Specify the labels of the nodes where the Ingress listeners should be exposed
      #
      # matchLabels:
      #   kubernetes.io/os: linux
      #   kubernetes.io/hostname: kind-worker
      matchLabels: {}
gatewayAPI:
  # -- Enable support for Gateway API in cilium
  # This will automatically set enable-envoy-config as well.
  enabled: false
  # -- Enable proxy protocol for all GatewayAPI listeners. Note that _only_ Proxy protocol traffic will be accepted once this is enabled.
  enableProxyProtocol: false
  # -- Enable Backend Protocol selection support (GEP-1911) for Gateway API via appProtocol.
  enableAppProtocol: false
  # -- Enable ALPN for all listeners configured with Gateway API. ALPN will attempt HTTP/2, then HTTP 1.1.
  # Note that this will also enable `appProtocol` support, and services that wish to use HTTP/2 will need to indicate that via their `appProtocol`.
  enableAlpn: false
  # -- The number of additional GatewayAPI proxy hops from the right side of the HTTP header to trust when determining the origin client's IP address.
  xffNumTrustedHops: 0
  # -- Control how traffic from external sources is routed to the LoadBalancer Kubernetes Service for all Cilium GatewayAPI Gateway instances. Valid values are "Cluster" and "Local".
  # Note that this value will be ignored when `hostNetwork.enabled == true`.
  # ref: https://kubernetes.io/docs/reference/networking/virtual-ips/#external-traffic-policy
  externalTrafficPolicy: Cluster
  gatewayClass:
    # -- Enable creation of GatewayClass resource
    # The default value is 'auto' which decides according to presence of gateway.networking.k8s.io/v1/GatewayClass in the cluster.
    # Other possible values are 'true' and 'false', which will either always or never create the GatewayClass, respectively.
    create: auto
  # -- SecretsNamespace is the namespace in which envoy SDS will retrieve TLS secrets from.
  secretsNamespace:
    # -- Create secrets namespace for Gateway API.
    create: true
    # -- Name of Gateway API secret namespace.
    name: cilium-secrets
    # -- Enable secret sync, which will make sure all TLS secrets used by Gateway API are synced to secretsNamespace.name.
    # If disabled, TLS secrets must be maintained externally.
    sync: true
  # Host Network related configuration
  hostNetwork:
    # -- Configure whether the Envoy listeners should be exposed on the host network.
    enabled: false
    # Specify the nodes where the Gateway API listeners should be exposed
    nodes:
      # -- Specify the labels of the nodes where the Gateway API listeners should be exposed
      #
      # matchLabels:
      #   kubernetes.io/os: linux
      #   kubernetes.io/hostname: kind-worker
      matchLabels: {}
# -- Enables the fallback compatibility solution for when the xt_socket kernel
# module is missing and it is needed for the datapath L7 redirection to work
# properly. See documentation for details on when this can be disabled:
# https://docs.cilium.io/en/stable/operations/system_requirements/#linux-kernel.
enableXTSocketFallback: true
encryption:
  # -- Enable transparent network encryption.
  enabled: false
  # -- Encryption method. Can be either ipsec or wireguard.
  type: ipsec
  # -- Enable encryption for pure node to node traffic.
  # This option is only effective when encryption.type is set to "wireguard".
  nodeEncryption: false
  # -- Configure the WireGuard Pod2Pod strict mode.
  strictMode:
    # -- Enable WireGuard Pod2Pod strict mode.
    enabled: false
    # -- CIDR for the WireGuard Pod2Pod strict mode.
    cidr: ""
    # -- Allow dynamic lookup of remote node identities.
    # This is required when tunneling is used or direct routing is used and the node CIDR and pod CIDR overlap.
    allowRemoteNodeIdentities: false
  ipsec:
    # -- Name of the key file inside the Kubernetes secret configured via secretName.
    keyFile: keys
    # -- Path to mount the secret inside the Cilium pod.
    mountPath: /etc/ipsec
    # -- Name of the Kubernetes secret containing the encryption keys.
    secretName: cilium-ipsec-keys
    # -- The interface to use for encrypted traffic.
    interface: ""
    # -- Enable the key watcher. If disabled, a restart of the agent will be
    # necessary on key rotations.
    keyWatcher: true
    # -- Maximum duration of the IPsec key rotation. The previous key will be
    # removed after that delay.
    keyRotationDuration: "5m"
    # -- Enable IPsec encrypted overlay
    encryptedOverlay: false
  wireguard:
    # -- Enables the fallback to the user-space implementation (deprecated).
    userspaceFallback: false
    # -- Controls WireGuard PersistentKeepalive option. Set 0s to disable.
    persistentKeepalive: 0s
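# Example (a sketch, not the upstream default): an override that switches
# transparent encryption to WireGuard and also encrypts node-to-node traffic.
# It uses only the keys documented above. Whether the nodes' kernels support
# WireGuard is an assumption to verify before enabling this.
#
#   encryption:
#     enabled: true
#     type: wireguard
#     nodeEncryption: true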
endpointHealthChecking:
  # -- Enable connectivity health checking between virtual endpoints.
  enabled: true
endpointRoutes:
  # @schema
  # type: [boolean, string]
  # @schema
  # -- Enable use of per endpoint routes instead of routing via
  # the cilium_host interface.
  enabled: false
k8sNetworkPolicy:
  # -- Enable support for K8s NetworkPolicy
  enabled: true
eni:
  # -- Enable Elastic Network Interface (ENI) integration.
  enabled: false
  # -- Update ENI Adapter limits from the EC2 API
  updateEC2AdapterLimitViaAPI: true
  # -- Release IPs not used from the ENI
  awsReleaseExcessIPs: false
  # -- Enable ENI prefix delegation
  awsEnablePrefixDelegation: false
  # -- EC2 API endpoint to use
  ec2APIEndpoint: ""
  # -- Tags to apply to the newly created ENIs
  eniTags: {}
  # -- Interval for garbage collection of unattached ENIs. Set to "0s" to disable.
  # @default -- `"5m"`
  gcInterval: ""
  # -- Additional tags attached to ENIs created by Cilium.
  # Dangling ENIs with this tag will be garbage collected
  # @default -- `{"io.cilium/cilium-managed":"true","io.cilium/cluster-name":"<auto-detected>"}`
  gcTags: {}
  # -- If an IAM role for Service Accounts is used, Cilium will not try to
  # inject identity values from the cilium-aws Kubernetes secret.
  # Adds annotation to service account if managed by Helm.
  # See https://github.com/aws/amazon-eks-pod-identity-webhook
  iamRole: ""
  # -- Filter via subnet IDs which will dictate which subnets are going to be used to create new ENIs
  # Important note: This requires that each instance has an ENI with a matching subnet attached
  # when Cilium is deployed. If you only want to control subnets for ENIs attached by Cilium,
  # use the CNI configuration file settings (cni.customConf) instead.
  subnetIDsFilter: []
  # -- Filter via tags (k=v) which will dictate which subnets are going to be used to create new ENIs
  # Important note: This requires that each instance has an ENI with a matching subnet attached
  # when Cilium is deployed. If you only want to control subnets for ENIs attached by Cilium,
  # use the CNI configuration file settings (cni.customConf) instead.
  subnetTagsFilter: []
  # -- Filter via AWS EC2 Instance tags (k=v) which will dictate which AWS EC2 Instances
  # are going to be used to create new ENIs
  instanceTagsFilter: []
externalIPs:
  # -- Enable ExternalIPs service support.
  enabled: true
# fragmentTracking enables IPv4 fragment tracking support in the datapath.
# fragmentTracking: true
gke:
  # -- Enable Google Kubernetes Engine integration
  enabled: false
# -- Enable connectivity health checking.
healthChecking: true
# -- TCP port for the agent health API. This is not the port for cilium-health.
healthPort: 9879
# -- Configure the host firewall.
hostFirewall:
  # -- Enables the enforcement of host policies in the eBPF datapath.
  enabled: false
hostPort:
  # -- Enable hostPort service support.
  enabled: false
# -- Configure socket LB
socketLB:
  # -- Enable socket LB
  enabled: false
  # -- Disable socket lb for non-root ns. This is used to enable Istio routing rules.
  # hostNamespaceOnly: false
  # -- Enable terminating pod connections to deleted service backends.
  # terminatePodConnections: true
# -- Configure certificate generation for Hubble integration.
# If hubble.tls.auto.method=cronJob, these values are used
# for the Kubernetes CronJob which will be scheduled regularly to
# (re)generate any certificates not provided manually.
certgen:
  image:
    # @schema
    # type: [null, string]
    # @schema
    override: ~
    repository: "quay.io/cilium/certgen"
    tag: "v0.2.0"
    digest: "sha256:169d93fd8f2f9009db3b9d5ccd37c2b753d0989e1e7cd8fe79f9160c459eef4f"
    useDigest: true
    pullPolicy: "IfNotPresent"
  # -- Seconds after which the completed job pod will be deleted
  ttlSecondsAfterFinished: 1800
  # -- Labels to be added to hubble-certgen pods
  podLabels: {}
  # -- Annotations to be added to the hubble-certgen initial Job and CronJob
  annotations:
    job: {}
    cronJob: {}
  # -- Node tolerations for pod assignment on nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations: []
  # -- Additional certgen volumes.
  extraVolumes: []
  # -- Additional certgen volumeMounts.
  extraVolumeMounts: []
  # -- Affinity for certgen
  affinity: {}
hubble:
  # -- Enable Hubble (true by default).
  enabled: true
  # -- Annotations to be added to all top-level hubble objects (resources under templates/hubble)
  annotations: {}
  # -- Buffer size of the channel Hubble uses to receive monitor events. If this
  # value is not set, the queue size is set to the default monitor queue size.
  # eventQueueSize: ""

  # -- Number of recent flows for Hubble to cache. Defaults to 4095.
  # Possible values are:
  #   1, 3, 7, 15, 31, 63, 127, 255, 511, 1023,
  #   2047, 4095, 8191, 16383, 32767, 65535
  # eventBufferCapacity: "4095"

  # -- Hubble metrics configuration.
  # See https://docs.cilium.io/en/stable/observability/metrics/#hubble-metrics
  # for more comprehensive documentation about Hubble metrics.
  metrics:
    # @schema
    # type: [null, array]
    # @schema
    # -- Configures the list of metrics to collect. If empty or null, metrics
    # are disabled.
    # Example:
    #
    #   enabled:
    #   - dns:query;ignoreAAAA
    #   - drop
    #   - tcp
    #   - flow
    #   - icmp
    #   - http
    #
    # You can specify the list of metrics from the helm CLI:
    #
    #   --set hubble.metrics.enabled="{dns:query;ignoreAAAA,drop,tcp,flow,icmp,http}"
    #
    enabled: ~
    # -- Enables exporting hubble metrics in OpenMetrics format.
    enableOpenMetrics: false
    # -- Configure the port the hubble metric server listens on.
    port: 9965
    tls:
      # Enable hubble metrics server TLS.
      enabled: false
      # Configure hubble metrics server TLS.
      server:
        # -- base64 encoded PEM values for the Hubble metrics server certificate.
        cert: ""
        # -- base64 encoded PEM values for the Hubble metrics server key.
        key: ""
        # -- Extra DNS names added to certificate when it's auto generated
        extraDnsNames: []
        # -- Extra IP addresses added to certificate when it's auto generated
        extraIpAddresses: []
        # -- Configure mTLS for the Hubble metrics server.
        mtls:
          # When set to true enforces mutual TLS between Hubble Metrics server and its clients.
          # When false, non-mutual TLS connections are allowed.
          # This option has no effect when TLS is disabled.
          enabled: false
          useSecret: false
          # -- Name of the ConfigMap containing the CA to validate client certificates against.
          # If mTLS is enabled and this is unspecified, it will default to the
          # same CA used for Hubble metrics server certificates.
          name: ~
          # -- Entry of the ConfigMap containing the CA.
          key: ca.crt
    # -- Annotations to be added to hubble-metrics service.
    serviceAnnotations: {}
    serviceMonitor:
      # -- Create ServiceMonitor resources for Prometheus Operator.
      # This requires the prometheus CRDs to be available.
      # ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
      enabled: false
      # -- Labels to add to ServiceMonitor hubble
      labels: {}
      # -- Annotations to add to ServiceMonitor hubble
      annotations: {}
      # -- jobLabel to add for ServiceMonitor hubble
      jobLabel: ""
      # -- Interval for scrape metrics.
      interval: "10s"
      # -- Relabeling configs for the ServiceMonitor hubble
      relabelings:
        - sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: node
          replacement: ${1}
      # @schema
      # type: [null, array]
      # @schema
      # -- Metrics relabeling configs for the ServiceMonitor hubble
      metricRelabelings: ~
      # Configure TLS for the ServiceMonitor.
      # Note, when using TLS you will either need to specify
      # tlsConfig.insecureSkipVerify or specify a CA to use.
      tlsConfig: {}
    # -- Grafana dashboards for hubble
    # grafana can import dashboards based on the label and value
    # ref: https://github.com/grafana/helm-charts/tree/main/charts/grafana#sidecar-for-dashboards
    dashboards:
      enabled: false
      label: grafana_dashboard
      # @schema
      # type: [null, string]
      # @schema
      namespace: ~
      labelValue: "1"
      annotations: {}
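    # Example (a sketch, not part of the upstream defaults): an override that
    # turns on a few Hubble metrics and lets Prometheus Operator scrape them.
    # It uses only keys documented above and assumes the Prometheus Operator
    # CRDs are already installed in the cluster.
    #
    #   hubble:
    #     metrics:
    #       enabled:
    #         - dns:query;ignoreAAAA
    #         - drop
    #         - tcp
    #       serviceMonitor:
    #         enabled: true
    #       dashboards:
    #         enabled: true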
  # -- Unix domain socket path to listen to when Hubble is enabled.
  socketPath: /var/run/cilium/hubble.sock
  # -- Enables redacting sensitive information present in Layer 7 flows.
  redact:
    enabled: true
    http:
      # -- Enables redacting URL query (GET) parameters.
      # Example:
      #
      #   redact:
      #     enabled: true
      #     http:
      #       urlQuery: true
      #
      # You can specify the options from the helm CLI:
      #
      #   --set hubble.redact.enabled="true"
      #   --set hubble.redact.http.urlQuery="true"
      urlQuery: false
      # -- Enables redacting user info, e.g., password when basic auth is used.
      # Example:
      #
      #   redact:
      #     enabled: true
      #     http:
      #       userInfo: true
      #
      # You can specify the options from the helm CLI:
      #
      #   --set hubble.redact.enabled="true"
      #   --set hubble.redact.http.userInfo="true"
      userInfo: true
      headers:
        # -- List of HTTP headers to allow: headers not matching will be redacted. Note: `allow` and `deny` lists cannot be used both at the same time, only one can be present.
        # Example:
        #   redact:
        #     enabled: true
        #     http:
        #       headers:
        #         allow:
        #           - traceparent
        #           - tracestate
        #           - Cache-Control
        #
        # You can specify the options from the helm CLI:
        #   --set hubble.redact.enabled="true"
        #   --set hubble.redact.http.headers.allow="traceparent,tracestate,Cache-Control"
        allow: []
        # -- List of HTTP headers to deny: matching headers will be redacted. Note: `allow` and `deny` lists cannot be used both at the same time, only one can be present.
        # Example:
        #   redact:
        #     enabled: true
        #     http:
        #       headers:
        #         deny:
        #           - Authorization
        #           - Proxy-Authorization
        #
        # You can specify the options from the helm CLI:
        #   --set hubble.redact.enabled="true"
        #   --set hubble.redact.http.headers.deny="Authorization,Proxy-Authorization"
        deny: []
    kafka:
      # -- Enables redacting Kafka's API key.
      # Example:
      #
      #   redact:
      #     enabled: true
      #     kafka:
      #       apiKey: true
      #
      # You can specify the options from the helm CLI:
      #
      #   --set hubble.redact.enabled="true"
      #   --set hubble.redact.kafka.apiKey="true"
      apiKey: false
  # -- An additional address for Hubble to listen to.
  # Set this field to ":4244" if you are enabling Hubble Relay, as it assumes that
  # Hubble is listening on port 4244.
  listenAddress: ":4244"
  # -- Whether Hubble should prefer to announce IPv6 or IPv4 addresses if both are available.
  preferIpv6: false
  # @schema
  # type: [null, boolean]
  # @schema
  # -- (bool) Skip Hubble events with unknown cgroup ids
  # @default -- `true`
  skipUnknownCGroupIDs: ~
  peerService:
    # -- Service Port for the Peer service.
    # If not set, it is dynamically assigned to port 443 if TLS is enabled and to
    # port 80 if not.
    # servicePort: 80
    # -- Target Port for the Peer service, must match the hubble.listenAddress'
    # port.
    targetPort: 4244
    # -- The cluster domain to use to query the Hubble Peer service. It should
    # be the local cluster.
    clusterDomain: cluster.local
  # -- TLS configuration for Hubble
  tls:
    # -- Enable mutual TLS for listenAddress. Setting this value to false is
    # highly discouraged as the Hubble API provides access to potentially
    # sensitive network flow metadata and is exposed on the host network.
    enabled: true
    # -- Configure automatic TLS certificates generation.
    auto:
      # -- Auto-generate certificates.
      # When set to true, automatically generate a CA and certificates to
      # enable mTLS between Hubble server and Hubble Relay instances. If set to
      # false, the certs for Hubble server need to be provided by setting
      # appropriate values below.
      enabled: true
      # -- Set the method to auto-generate certificates. Supported values:
      # - helm:         This method uses Helm to generate all certificates.
      # - cronJob:      This method uses a Kubernetes CronJob to generate any
      #                 certificates not provided by the user at installation
      #                 time.
      # - certmanager:  This method uses cert-manager to generate & rotate certificates.
      method: helm
      # -- Generated certificates validity duration in days.
      certValidityDuration: 1095
      # -- Schedule for certificates regeneration (regardless of their expiration date).
      # Only used if method is "cronJob". If nil, then no recurring job will be created.
      # Instead, only the one-shot job is deployed to generate the certificates at
      # installation time.
      #
      # Defaults to midnight of the first day of every fourth month. For syntax, see
      # https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-syntax
      schedule: "0 0 1 */4 *"
      # [Example]
      # certManagerIssuerRef:
      #   group: cert-manager.io
      #   kind: ClusterIssuer
      #   name: ca-issuer
      # -- certmanager issuer used when hubble.tls.auto.method=certmanager.
      certManagerIssuerRef: {}
    # -- base64 encoded PEM values for the Hubble server certificate and private key
    server:
      cert: ""
      key: ""
      # -- Extra DNS names added to certificate when it's auto generated
      extraDnsNames: []
      # -- Extra IP addresses added to certificate when it's auto generated
      extraIpAddresses: []
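    # Example (a sketch): delegating Hubble certificate generation to
    # cert-manager instead of Helm. The issuer name "ca-issuer" is the
    # placeholder from the certManagerIssuerRef example above, and it assumes
    # such a ClusterIssuer already exists.
    #
    #   hubble:
    #     tls:
    #       auto:
    #         enabled: true
    #         method: certmanager
    #         certManagerIssuerRef:
    #           group: cert-manager.io
    #           kind: ClusterIssuer
    #           name: ca-issuer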
  relay:
    # -- Enable Hubble Relay (requires hubble.enabled=true)
    enabled: true
    # -- Roll out Hubble Relay pods automatically when configmap is updated.
    rollOutPods: false
    # -- Hubble-relay container image.
    image:
      # @schema
      # type: [null, string]
      # @schema
      override: ~
      repository: "quay.io/cilium/hubble-relay"
      tag: "v1.16.0"
      # hubble-relay-digest
      digest: "sha256:33fca7776fc3d7b2abe08873319353806dc1c5e07e12011d7da4da05f836ce8d"
      useDigest: true
      pullPolicy: "IfNotPresent"
    # -- Specifies the resources for the hubble-relay pods
    resources: {}
    # -- Number of replicas run for the hubble-relay deployment.
    replicas: 1
    # -- Affinity for hubble-relay
    affinity:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                k8s-app: cilium
    # -- Pod topology spread constraints for hubble-relay
    topologySpreadConstraints: []
    #   - maxSkew: 1
    #     topologyKey: topology.kubernetes.io/zone
    #     whenUnsatisfiable: DoNotSchedule

    # -- Node labels for pod assignment
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
    nodeSelector:
      kubernetes.io/os: linux
    # -- Node tolerations for pod assignment on nodes with taints
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
    tolerations: []
    # -- Additional hubble-relay environment variables.
    extraEnv: []
    # -- Annotations to be added to all top-level hubble-relay objects (resources under templates/hubble-relay)
    annotations: {}
    # -- Annotations to be added to hubble-relay pods
    podAnnotations: {}
    # -- Labels to be added to hubble-relay pods
    podLabels: {}
    # PodDisruptionBudget settings
    podDisruptionBudget:
      # -- enable PodDisruptionBudget
      # ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
      enabled: false
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Minimum number/percentage of pods that should remain scheduled.
      # When it's set, maxUnavailable must be disabled by `maxUnavailable: null`
      minAvailable: null
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Maximum number/percentage of pods that may be made unavailable
      maxUnavailable: 1
    # -- The priority class to use for hubble-relay
    priorityClassName: ""
    # -- Configure termination grace period for hubble relay Deployment.
    terminationGracePeriodSeconds: 1
    # -- hubble-relay update strategy
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        # @schema
        # type: [integer, string]
        # @schema
        maxUnavailable: 1
    # -- Additional hubble-relay volumes.
    extraVolumes: []
    # -- Additional hubble-relay volumeMounts.
    extraVolumeMounts: []
    # -- hubble-relay pod security context
    podSecurityContext:
      fsGroup: 65532
    # -- hubble-relay container security context
    securityContext:
      # readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 65532
      runAsGroup: 65532
      capabilities:
        drop:
          - ALL
    # -- hubble-relay service configuration.
    service:
      # --- The type of service used for Hubble Relay access, either ClusterIP or NodePort.
      type: ClusterIP
      # --- The port to use when the service type is set to NodePort.
      nodePort: 31234
    # -- Host to listen to. Specify an empty string to bind to all the interfaces.
    listenHost: ""
    # -- Port to listen to.
    listenPort: "4245"
    # -- TLS configuration for Hubble Relay
    tls:
      # -- base64 encoded PEM values for the hubble-relay client certificate and private key
      # This keypair is presented to Hubble server instances for mTLS
      # authentication and is required when hubble.tls.enabled is true.
      # These values need to be set manually if hubble.tls.auto.enabled is false.
      client:
        cert: ""
        key: ""
      # -- base64 encoded PEM values for the hubble-relay server certificate and private key
      server:
        # When set to true, enable TLS for the Hubble Relay server
        # (i.e. for clients connecting to the Hubble Relay API).
        enabled: false
        # When set to true enforces mutual TLS between Hubble Relay server and its clients.
        # When false, non-mutual TLS connections are allowed.
        # This option has no effect when TLS is disabled.
        mtls: false
        # These values need to be set manually if hubble.tls.auto.enabled is false.
        cert: ""
        key: ""
        # -- Extra DNS names added to certificate when it's auto generated
        extraDnsNames: []
        # -- Extra IP addresses added to certificate when it's auto generated
        extraIpAddresses: []
        # DNS name used by the backend to connect to the relay
        # This is a simple workaround as the relay certificates are currently hardcoded to
        # *.hubble-relay.cilium.io
        # See https://github.com/cilium/cilium/pull/28709#discussion_r1371792546
        # For GKE Dataplane V2 this should be set to relay.kube-system.svc.cluster.local
        relayName: "ui.hubble-relay.cilium.io"
    # @schema
    # type: [null, string]
    # @schema
    # -- Dial timeout to connect to the local hubble instance to receive peer information (e.g. "30s").
    dialTimeout: ~
    # @schema
    # type: [null, string]
    # @schema
    # -- Backoff duration to retry connecting to the local hubble instance in case of failure (e.g. "30s").
    retryTimeout: ~
    # @schema
    # type: [null, integer]
    # @schema
    # -- (int) Max number of flows that can be buffered for sorting before being sent to the
    # client (per request) (e.g. 100).
    sortBufferLenMax: ~
    # @schema
    # type: [null, string]
    # @schema
    # -- When the per-request flows sort buffer is not full, a flow is drained every
    # time this timeout is reached (only affects requests in follow-mode) (e.g. "1s").
    sortBufferDrainTimeout: ~
    # -- Port to use for the k8s service backed by hubble-relay pods.
    # If not set, it is dynamically assigned to port 443 if TLS is enabled and to
    # port 80 if not.
    # servicePort: 80

    # -- Enable prometheus metrics for hubble-relay on the configured port at
    # /metrics
    prometheus:
      enabled: false
      port: 9966
      serviceMonitor:
        # -- Enable service monitors.
        # This requires the prometheus CRDs to be available (see https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml)
        enabled: false
        # -- Labels to add to ServiceMonitor hubble-relay
        labels: {}
        # -- Annotations to add to ServiceMonitor hubble-relay
        annotations: {}
        # -- Interval for scrape metrics.
        interval: "10s"
        # -- Specify the Kubernetes namespace where Prometheus expects to find
        # service monitors configured.
        # namespace: ""
        # @schema
        # type: [null, array]
        # @schema
        # -- Relabeling configs for the ServiceMonitor hubble-relay
        relabelings: ~
        # @schema
        # type: [null, array]
        # @schema
        # -- Metrics relabeling configs for the ServiceMonitor hubble-relay
        metricRelabelings: ~
    gops:
      # -- Enable gops for hubble-relay
      enabled: true
      # -- Configure gops listen port for hubble-relay
      port: 9893
    pprof:
      # -- Enable pprof for hubble-relay
      enabled: false
      # -- Configure pprof listen address for hubble-relay
      address: localhost
      # -- Configure pprof listen port for hubble-relay
      port: 6062
  ui:
    # -- Whether to enable the Hubble UI.
    enabled: true
    standalone:
      # -- When true, it will allow installing the Hubble UI only, without checking dependencies.
      # It is useful if a cluster already has cilium and Hubble relay installed and you just
      # want Hubble UI to be deployed.
      # If Cilium was installed via Helm, install the UI with `helm upgrade`; if it was installed via the cilium CLI, use `cilium hubble enable --ui`.
      enabled: false
      tls:
        # -- When deploying Hubble UI in standalone, with tls enabled for Hubble relay, it is required
        # to provide a volume for mounting the client certificates.
        certsVolume: {}
        #   projected:
        #     defaultMode: 0400
        #     sources:
        #     - secret:
        #         name: hubble-ui-client-certs
        #         items:
        #         - key: tls.crt
        #           path: client.crt
        #         - key: tls.key
        #           path: client.key
        #         - key: ca.crt
        #           path: hubble-relay-ca.crt
    # -- Roll out Hubble-ui pods automatically when configmap is updated.
    rollOutPods: false
    tls:
      # -- base64 encoded PEM values used to connect to hubble-relay
      # This keypair is presented to Hubble Relay instances for mTLS
      # authentication and is required when hubble.relay.tls.server.enabled is true.
      # These values need to be set manually if hubble.tls.auto.enabled is false.
      client:
        cert: ""
        key: ""
    backend:
      # -- Hubble-ui backend image.
      image:
        # @schema
        # type: [null, string]
        # @schema
        override: ~
        repository: "quay.io/cilium/hubble-ui-backend"
        tag: "v0.13.1"
        digest: "sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b"
        useDigest: true
        pullPolicy: "IfNotPresent"
      # -- Hubble-ui backend security context.
      securityContext: {}
      # -- Additional hubble-ui backend environment variables.
      extraEnv: []
      # -- Additional hubble-ui backend volumes.
      extraVolumes: []
      # -- Additional hubble-ui backend volumeMounts.
      extraVolumeMounts: []
      livenessProbe:
        # -- Enable liveness probe for Hubble-ui backend (requires Hubble-ui 0.12+)
        enabled: false
      readinessProbe:
        # -- Enable readiness probe for Hubble-ui backend (requires Hubble-ui 0.12+)
        enabled: false
      # -- Resource requests and limits for the 'backend' container of the 'hubble-ui' deployment.
      resources: {}
      #   limits:
      #     cpu: 1000m
      #     memory: 1024M
      #   requests:
      #     cpu: 100m
      #     memory: 64Mi
    frontend:
      # -- Hubble-ui frontend image.
      image:
        # @schema
        # type: [null, string]
        # @schema
        override: ~
        repository: "quay.io/cilium/hubble-ui"
        tag: "v0.13.1"
        digest: "sha256:e2e9313eb7caf64b0061d9da0efbdad59c6c461f6ca1752768942bfeda0796c6"
        useDigest: true
        pullPolicy: "IfNotPresent"
      # -- Hubble-ui frontend security context.
      securityContext: {}
      # -- Additional hubble-ui frontend environment variables.
      extraEnv: []
      # -- Additional hubble-ui frontend volumes.
      extraVolumes: []
      # -- Additional hubble-ui frontend volumeMounts.
      extraVolumeMounts: []
      # -- Resource requests and limits for the 'frontend' container of the 'hubble-ui' deployment.
      resources: {}
      #   limits:
      #     cpu: 1000m
      #     memory: 1024M
      #   requests:
      #     cpu: 100m
      #     memory: 64Mi
      server:
        # -- Controls server listener for ipv6
        ipv6:
          enabled: true
    # -- The number of replicas of Hubble UI to deploy.
    replicas: 1
    # -- Annotations to be added to all top-level hubble-ui objects (resources under templates/hubble-ui)
    annotations: {}
    # -- Annotations to be added to hubble-ui pods
    podAnnotations: {}
    # -- Labels to be added to hubble-ui pods
    podLabels: {}
    # PodDisruptionBudget settings
    podDisruptionBudget:
      # -- enable PodDisruptionBudget
      # ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
      enabled: false
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Minimum number/percentage of pods that should remain scheduled.
      # When it's set, maxUnavailable must be disabled by `maxUnavailable: null`
      minAvailable: null
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Maximum number/percentage of pods that may be made unavailable
      maxUnavailable: 1
    # -- Affinity for hubble-ui
    affinity: {}
    # -- Pod topology spread constraints for hubble-ui
    topologySpreadConstraints: []
    #   - maxSkew: 1
    #     topologyKey: topology.kubernetes.io/zone
    #     whenUnsatisfiable: DoNotSchedule

    # -- Node labels for pod assignment
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
    nodeSelector:
      kubernetes.io/os: linux
    # -- Node tolerations for pod assignment on nodes with taints
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
    tolerations: []
    # -- The priority class to use for hubble-ui
    priorityClassName: ""
    # -- hubble-ui update strategy.
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        # @schema
        # type: [integer, string]
        # @schema
        maxUnavailable: 1
    # -- Security context to be added to Hubble UI pods
    securityContext:
      runAsUser: 1001
      runAsGroup: 1001
      fsGroup: 1001
    # -- hubble-ui service configuration.
    service:
      # -- Annotations to be added for the Hubble UI service
      annotations: {}
      # --- The type of service used for Hubble UI access, either ClusterIP or NodePort.
      type: ClusterIP
      # --- The port to use when the service type is set to NodePort.
      nodePort: 31235
    # -- Defines base url prefix for all hubble-ui http requests.
    # It needs to be changed if the ingress for hubble-ui is configured under a sub-path.
    # Trailing `/` is required for custom path, ex. `/service-map/`
    baseUrl: "/"
    # -- hubble-ui ingress configuration.
    ingress:
      enabled: false
      annotations: {}
      # kubernetes.io/ingress.class: nginx
      # kubernetes.io/tls-acme: "true"
      className: ""
      hosts:
        - chart-example.local
      labels: {}
      tls: []
      #  - secretName: chart-example-tls
      #    hosts:
      #      - chart-example.local
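      # Example (a sketch): exposing the Hubble UI through an ingress
      # controller. The class name "traefik" and the host
      # "hubble.example.internal" are hypothetical placeholders, not values
      # taken from this setup.
      #
      #   hubble:
      #     ui:
      #       ingress:
      #         enabled: true
      #         className: traefik
      #         hosts:
      #           - hubble.example.internal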
  # -- Hubble flows export.
  export:
    # --- Defines max file size of output file before it gets rotated.
    fileMaxSizeMb: 10
    # --- Defines max number of backup/rotated files.
    fileMaxBackups: 5
    # --- Static exporter configuration.
    # Static exporter is bound to agent lifecycle.
    static:
      enabled: false
      filePath: /var/run/cilium/hubble/events.log
      fieldMask: []
      # - time
      # - source
      # - destination
      # - verdict
      allowList: []
      # - '{"verdict":["DROPPED","ERROR"]}'
      denyList: []
      # - '{"source_pod":["kube-system/"]}'
      # - '{"destination_pod":["kube-system/"]}'
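      # Example (a sketch): a static exporter that writes only dropped or
      # errored flows, trimmed to four fields. It reuses the sample values
      # from the comments above and is not an upstream default.
      #
      #   hubble:
      #     export:
      #       static:
      #         enabled: true
      #         filePath: /var/run/cilium/hubble/events.log
      #         fieldMask:
      #           - time
      #           - source
      #           - destination
      #           - verdict
      #         allowList:
      #           - '{"verdict":["DROPPED","ERROR"]}'
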
    # --- Dynamic exporters configuration.
    # Dynamic exporters may be reconfigured without a need of agent restarts.
    dynamic:
      enabled: false
      config:
        # ---- Name of the ConfigMap with configuration that may be altered to reconfigure exporters within running agents.
        configMapName: cilium-flowlog-config
        # ---- True if helm installer should create config map.
        # Switch to false if you want to self maintain the file content.
        createConfigMap: true
        # ---- Exporters configuration in YAML format.
        content:
          - name: all
            fieldMask: []
            includeFilters: []
            excludeFilters: []
            filePath: "/var/run/cilium/hubble/events.log"
            #   - name: "test002"
            #     filePath: "/var/log/network/flow-log/pa/test002.log"
            #     fieldMask: ["source.namespace", "source.pod_name", "destination.namespace", "destination.pod_name", "verdict"]
            #     includeFilters:
            #     - source_pod: ["default/"]
            #       event_type:
            #       - type: 1
            #     - destination_pod: ["frontend/nginx-975996d4c-7hhgt"]
            #     excludeFilters: []
            #     end: "2023-10-09T23:59:59-07:00"
  # -- Emit v1.Events related to pods on detection of packet drops.
  #    This feature is alpha, please provide feedback at https://github.com/cilium/cilium/issues/33975.
  dropEventEmitter:
    enabled: false
    # --- Minimum time between emitting same events.
    interval: 2m
    # --- Drop reasons to emit events for.
    # ref: https://docs.cilium.io/en/stable/_api/v1/flow/README/#dropreason
    reasons:
      - auth_required
      - policy_denied
# -- Method to use for identity allocation (`crd` or `kvstore`).
identityAllocationMode: "crd"
# -- (string) Time to wait before using new identity on endpoint identity change.
# @default -- `"5s"`
identityChangeGracePeriod: ""
# -- Install Iptables rules to skip netfilter connection tracking on all pod
# traffic. This option is only effective when Cilium is running in direct
# routing and full KPR mode. Moreover, this option cannot be enabled when Cilium
# is running in a managed Kubernetes environment or in a chained CNI setup.
installNoConntrackIptablesRules: false
ipam:
  # -- Configure IP Address Management mode.
  # ref: https://docs.cilium.io/en/stable/network/concepts/ipam/
  mode: "cluster-pool"
  # -- Maximum rate at which the CiliumNode custom resource is updated.
  ciliumNodeUpdateRate: "15s"
  operator:
    # @schema
    # type: [array, string]
    # @schema
    # -- IPv4 CIDR list range to delegate to individual nodes for IPAM.
    clusterPoolIPv4PodCIDRList: ["10.222.0.0/16"]
    # -- IPv4 CIDR mask size to delegate to individual nodes for IPAM.
    clusterPoolIPv4MaskSize: 24
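    # Worked example (illustrative arithmetic only, not part of the upstream chart):
    # splitting the 10.222.0.0/16 pool into /24 blocks yields 2^(24-16) = 256
    # per-node blocks of 2^(32-24) = 256 addresses each, e.g. 10.222.0.0/24,
    # 10.222.1.0/24, and so on. Which block a given node receives is decided
    # by the operator, so the block-to-node mapping here is an assumption.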
    # @schema
    # type: [array, string]
    # @schema
    # -- IPv6 CIDR list range to delegate to individual nodes for IPAM.
    clusterPoolIPv6PodCIDRList: ["fd00::/104"]
    # -- IPv6 CIDR mask size to delegate to individual nodes for IPAM.
    clusterPoolIPv6MaskSize: 120
    # -- IP pools to auto-create in multi-pool IPAM mode.
    autoCreateCiliumPodIPPools: {}
    #   default:
    #     ipv4:
    #       cidrs:
    #         - 10.10.0.0/8
    #       maskSize: 24
    #   other:
    #     ipv6:
    #       cidrs:
    #         - fd00:100::/80
    #       maskSize: 96
    # @schema
    # type: [null, integer]
    # @schema
    # -- (int) The maximum burst size when rate limiting access to external APIs.
    # Also known as the token bucket capacity.
    # @default -- `20`
    externalAPILimitBurstSize: ~
    # @schema
    # type: [null, number]
    # @schema
    # -- (float) The maximum queries per second when rate limiting access to
    # external APIs. Also known as the bucket refill rate, which is used to
    # refill the bucket up to the burst size capacity.
    # @default -- `4.0`
    externalAPILimitQPS: ~
nodeIPAM:
  # -- Configure Node IPAM
  # ref: https://docs.cilium.io/en/stable/network/node-ipam/
  enabled: false
# @schema
# type: [null, string]
# @schema
# -- The api-rate-limit option can be used to overwrite individual settings of the default configuration for rate limiting calls to the Cilium Agent API
apiRateLimit: ~
# -- Configure the eBPF-based ip-masq-agent
ipMasqAgent:
  enabled: false
  # the config of nonMasqueradeCIDRs
  config:
    nonMasqueradeCIDRs:
      - 10.222.0.0/16
    masqLinkLocal: false
    masqLinkLocalIPv6: false

# iptablesLockTimeout defines the iptables "--wait" option when invoked from Cilium.
# iptablesLockTimeout: "5s"
ipv4:
  # -- Enable IPv4 support.
  enabled: true
ipv6:
  # -- Enable IPv6 support.
  enabled: false
# -- Configure Kubernetes specific configuration
k8s:
  # -- requireIPv4PodCIDR enables waiting for Kubernetes to provide the PodCIDR
  # range via the Kubernetes node resource
  requireIPv4PodCIDR: false
  # -- requireIPv6PodCIDR enables waiting for Kubernetes to provide the PodCIDR
  # range via the Kubernetes node resource
  requireIPv6PodCIDR: false
# -- Keep the deprecated selector labels when deploying Cilium DaemonSet.
keepDeprecatedLabels: false
# -- Keep the deprecated probes when deploying Cilium DaemonSet
keepDeprecatedProbes: false
startupProbe:
  # -- failure threshold of startup probe.
  # 105 x 2s translates to the old behaviour of the readiness probe (120s delay + 30 x 3s)
  failureThreshold: 105
  # -- interval between checks of the startup probe
  periodSeconds: 2
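  # Worked arithmetic for the two values above: 105 failures x 2s = 210s,
  # which matches the old readiness budget of 120s + 30 x 3s = 210s.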
livenessProbe:
  # -- failure threshold of liveness probe
  failureThreshold: 10
  # -- interval between checks of the liveness probe
  periodSeconds: 30
readinessProbe:
  # -- failure threshold of readiness probe
  failureThreshold: 3
  # -- interval between checks of the readiness probe
  periodSeconds: 30
# -- Configure the kube-proxy replacement in Cilium BPF datapath
# Valid options are "true" or "false".
# ref: https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/
kubeProxyReplacement: "true"
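# Sketch (an assumption, not taken from this file): replacing kube-proxy on K3s
# usually also means starting the K3s server without its bundled kube-proxy,
# flannel and network policy controller. A hypothetical
# /etc/rancher/k3s/config.yaml for that could look like:
#   flannel-backend: "none"
#   disable-kube-proxy: true
#   disable-network-policy: true
# The agent then also needs a directly reachable API server address
# (chart keys k8sServiceHost / k8sServicePort), since no kube-proxy is
# available to serve the in-cluster `kubernetes` service.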

# -- healthz server bind address for the kube-proxy replacement.
# To enable set the value to '0.0.0.0:10256' for all ipv4
# addresses and this '[::]:10256' for all ipv6 addresses.
# By default it is disabled.
kubeProxyReplacementHealthzBindAddr: ""
l2NeighDiscovery:
  # -- Enable L2 neighbor discovery in the agent
  enabled: true
  # -- Override the agent's default neighbor resolution refresh period.
  refreshPeriod: "30s"
# -- Enable Layer 7 network policy.
l7Proxy: true
# -- Enable Local Redirect Policy.
localRedirectPolicy: false
# To include or exclude matched resources from cilium identity evaluation
# labels: ""

# logOptions allows you to define logging options. eg:
# logOptions:
#   format: json

# -- Enables periodic logging of system load
logSystemLoad: false
# -- Configure maglev consistent hashing
maglev: {}
# -- tableSize is the size (parameter M) for the backend table of one
# service entry
# tableSize:

# -- hashSeed is the cluster-wide base64 encoded seed for the hashing
# hashSeed:

# -- Enables masquerading of IPv4 traffic leaving the node from endpoints.
enableIPv4Masquerade: true
# -- Enables masquerading of IPv6 traffic leaving the node from endpoints.
enableIPv6Masquerade: true
# -- Enables masquerading to the source of the route for traffic leaving the node from endpoints.
enableMasqueradeRouteSource: false
# -- Enables IPv4 BIG TCP support which increases maximum IPv4 GSO/GRO limits for nodes and pods
enableIPv4BIGTCP: false
# -- Enables IPv6 BIG TCP support which increases maximum IPv6 GSO/GRO limits for nodes and pods
enableIPv6BIGTCP: false
egressGateway:
  # -- Enables egress gateway to redirect and SNAT the traffic that leaves the
  # cluster.
  enabled: false
  # -- Time between triggers of egress gateway state reconciliations
  reconciliationTriggerInterval: 1s
  # -- Maximum number of entries in egress gateway policy map
  # maxPolicyEntries: 16384
vtep:
  # -- Enables VXLAN Tunnel Endpoint (VTEP) Integration (beta) to allow
  # Cilium-managed pods to talk to third party VTEP devices over Cilium tunnel.
  enabled: false
  # -- A space separated list of VTEP device endpoint IPs, for example "1.1.1.1 1.1.2.1"
  endpoint: ""
  # -- A space separated list of VTEP device CIDRs, for example "1.1.1.0/24 1.1.2.0/24"
  cidr: ""
  # -- VTEP CIDRs Mask that applies to all VTEP CIDRs, for example "255.255.255.0"
  mask: ""
  # -- A space separated list of VTEP device MAC addresses (VTEP MAC), for example "x:x:x:x:x:x y:y:y:y:y:y"
  mac: ""
# -- (string) Allows to explicitly specify the IPv4 CIDR for native routing.
# When specified, Cilium assumes networking for this CIDR is preconfigured and
# hands traffic destined for that range to the Linux network stack without
# applying any SNAT.
# Generally speaking, specifying a native routing CIDR implies that Cilium can
# depend on the underlying networking stack to route packets to their
# destination. To offer a concrete example, if Cilium is configured to use
# direct routing and the Kubernetes CIDR is included in the native routing CIDR,
# the user must configure the routes to reach pods, either manually or by
# setting the auto-direct-node-routes flag.
ipv4NativeRoutingCIDR: "10.222.0.0/16"
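# Sketch of how the native-routing keys fit together, written as a hypothetical
# override file (the combination is an assumption about this setup):
#   routingMode: "native"
#   ipv4NativeRoutingCIDR: "10.222.0.0/16"   # same range as clusterPoolIPv4PodCIDRList
#   autoDirectNodeRoutes: true               # chart key for the auto-direct-node-routes flag
# autoDirectNodeRoutes only helps when all nodes share one L2 segment; on a
# single-node cluster there are no inter-node pod routes to install.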
# -- (string) Allows to explicitly specify the IPv6 CIDR for native routing.
# When specified, Cilium assumes networking for this CIDR is preconfigured and
# hands traffic destined for that range to the Linux network stack without
# applying any SNAT.
# Generally speaking, specifying a native routing CIDR implies that Cilium can
# depend on the underlying networking stack to route packets to their
# destination. To offer a concrete example, if Cilium is configured to use
# direct routing and the Kubernetes CIDR is included in the native routing CIDR,
# the user must configure the routes to reach pods, either manually or by
# setting the auto-direct-node-routes flag.
ipv6NativeRoutingCIDR: ""
# -- cilium-monitor sidecar.
monitor:
  # -- Enable the cilium-monitor sidecar.
  enabled: false
# -- Configure service load balancing
loadBalancer:
  # -- standalone enables the standalone L4LB which does not connect to
  # kube-apiserver.
  # standalone: false

  # -- algorithm is the name of the load balancing algorithm for backend
  # selection e.g. random or maglev
  # algorithm: random

  # -- mode is the operation mode of load balancing for remote backends
  # e.g. snat, dsr, hybrid
  # mode: snat

  # -- acceleration is the option to accelerate service handling via XDP
  # Applicable values can be: disabled (do not use XDP), native (XDP BPF
  # program is run directly out of the networking driver's early receive
  # path), or best-effort (use native mode XDP acceleration on devices
  # that support it).
  acceleration: disabled
  # -- dsrDispatch configures whether IP option or IPIP encapsulation is
  # used to pass a service IP and port to remote backend
  # dsrDispatch: opt

  # -- serviceTopology enables K8s Topology Aware Hints -based service
  # endpoints filtering
  # serviceTopology: false

  # -- L7 LoadBalancer
  l7:
    # -- Enable L7 service load balancing via envoy proxy.
    # The request to a k8s service, which has specific annotation e.g. service.cilium.io/lb-l7,
    # will be forwarded to the local backend proxy to be load balanced to the service endpoints.
    # Please refer to docs for supported annotations for more configuration.
    #
    # Applicable values:
    #   - envoy: Enable L7 load balancing via envoy proxy. This will automatically set enable-envoy-config as well.
    #   - disabled: Disable L7 load balancing by way of service annotation.
    backend: disabled
    # -- List of ports from service to be automatically redirected to above backend.
    # Any service exposing one of these ports will be automatically redirected.
    # Fine-grained control can be achieved by using the service annotation.
    ports: []
    # -- Default LB algorithm
    # The default LB algorithm to be used for services, which can be overridden by the
    # service annotation (e.g. service.cilium.io/lb-l7-algorithm)
    # Applicable values: round_robin, least_request, random
    algorithm: round_robin
# -- Configure N-S k8s service loadbalancing
nodePort:
  # -- Enable the Cilium NodePort service implementation.
  enabled: false
  # -- Port range to use for NodePort services.
  # range: "30000,32767"

  # @schema
  # type: [null, string, array]
  # @schema
  # -- List of CIDRs for choosing which IP addresses assigned to native devices are used for NodePort load-balancing.
  # By default this is empty and the first suitable, preferably private, IPv4 and IPv6 address assigned to each device is used.
  #
  # Example:
  #
  #   addresses: ["192.168.1.0/24", "2001::/64"]
  #
  addresses: ~
  # -- Set to true to prevent applications binding to service ports.
  bindProtection: true
  # -- Append NodePort range to ip_local_reserved_ports if clash with ephemeral
  # ports is detected.
  autoProtectPortRange: true
  # -- Enable healthcheck nodePort server for NodePort services
  enableHealthCheck: true
  # -- Enable access of the healthcheck nodePort on the LoadBalancerIP. Needs
  # EnableHealthCheck to be enabled
  enableHealthCheckLoadBalancerIP: false
# policyAuditMode: false

# -- The agent can be put into one of the three policy enforcement modes:
# default, always and never.
# ref: https://docs.cilium.io/en/stable/security/policy/intro/#policy-enforcement-modes
policyEnforcementMode: "default"
# @schema
# type: [null, string, array]
# @schema
# -- policyCIDRMatchMode is a list of entities that may be selected by CIDR selector.
# The possible value is "nodes".
policyCIDRMatchMode:
pprof:
  # -- Enable pprof for cilium-agent
  enabled: false
  # -- Configure pprof listen address for cilium-agent
  address: localhost
  # -- Configure pprof listen port for cilium-agent
  port: 6060
# -- Configure prometheus metrics on the configured port at /metrics
prometheus:
  enabled: false
  port: 9962
  serviceMonitor:
    # -- Enable service monitors.
    # This requires the prometheus CRDs to be available (see https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml)
    enabled: false
    # -- Labels to add to ServiceMonitor cilium-agent
    labels: {}
    # -- Annotations to add to ServiceMonitor cilium-agent
    annotations: {}
    # -- jobLabel to add for ServiceMonitor cilium-agent
    jobLabel: ""
    # -- Interval for scrape metrics.
    interval: "10s"
    # -- Specify the Kubernetes namespace where Prometheus expects to find
    # service monitors configured.
    # namespace: ""
    # -- Relabeling configs for the ServiceMonitor cilium-agent
    relabelings:
      - sourceLabels:
          - __meta_kubernetes_pod_node_name
        targetLabel: node
        replacement: ${1}
    # @schema
    # type: [null, array]
    # @schema
    # -- Metrics relabeling configs for the ServiceMonitor cilium-agent
    metricRelabelings: ~
    # -- Set to `true` and helm will not check for monitoring.coreos.com/v1 CRDs before deploying
    trustCRDsExist: false
  # @schema
  # type: [null, array]
  # @schema
  # -- Metrics that should be enabled or disabled from the default metric list.
  # The list is expected to be separated by a space. (+metric_foo to enable
  # metric_foo , -metric_bar to disable metric_bar).
  # ref: https://docs.cilium.io/en/stable/observability/metrics/
  metrics: ~
  # --- Enable controller group metrics for monitoring specific Cilium
  # subsystems. The list is a list of controller group names. The special
  # values of "all" and "none" are supported. The set of controller
  # group names is not guaranteed to be stable between Cilium versions.
  controllerGroupMetrics:
    - write-cni-file
    - sync-host-ips
    - sync-lb-maps-with-k8s-services
# -- Grafana dashboards for cilium-agent
# grafana can import dashboards based on the label and value
# ref: https://github.com/grafana/helm-charts/tree/main/charts/grafana#sidecar-for-dashboards
dashboards:
  enabled: false
  label: grafana_dashboard
  # @schema
  # type: [null, string]
  # @schema
  namespace: ~
  labelValue: "1"
  annotations: {}
# Configure Cilium Envoy options.
envoy:
  # @schema
  # type: [null, boolean]
  # @schema
  # -- Enable Envoy Proxy in standalone DaemonSet.
  # This field is enabled by default for new installation.
  # @default -- `true` for new installation
  enabled: ~
  # -- (int)
  # Set Envoy '--base-id' to use when allocating shared memory regions.
  # Only needs to be changed if multiple Envoy instances will run on the same node and may have conflicts. Supported values: 0 - 4294967295. Defaults to '0'
  baseID: 0
  log:
    # -- The format string to use for laying out the log message metadata of Envoy.
    format: "[%Y-%m-%d %T.%e][%t][%l][%n] [%g:%#] %v"
    # -- Path to a separate Envoy log file, if any. Defaults to /dev/stdout.
    path: ""
  # -- Time in seconds after which a TCP connection attempt times out
  connectTimeoutSeconds: 2
  # -- ProxyMaxRequestsPerConnection specifies the max_requests_per_connection setting for Envoy
  maxRequestsPerConnection: 0
  # -- Set Envoy HTTP option max_connection_duration seconds. Default 0 (disable)
  maxConnectionDurationSeconds: 0
  # -- Set Envoy upstream HTTP idle connection timeout seconds.
  # Does not apply to connections with pending requests. Default 60s
  idleTimeoutDurationSeconds: 60
  # -- Number of trusted hops regarding the x-forwarded-for and related HTTP headers for the ingress L7 policy enforcement Envoy listeners.
  xffNumTrustedHopsL7PolicyIngress: 0
  # -- Number of trusted hops regarding the x-forwarded-for and related HTTP headers for the egress L7 policy enforcement Envoy listeners.
  xffNumTrustedHopsL7PolicyEgress: 0
  # -- Envoy container image.
  image:
    # @schema
    # type: [null, string]
    # @schema
    override: ~
    repository: "quay.io/cilium/cilium-envoy"
    tag: "v1.29.7-39a2a56bbd5b3a591f69dbca51d3e30ef97e0e51"
    pullPolicy: "IfNotPresent"
    digest: "sha256:bd5ff8c66716080028f414ec1cb4f7dc66f40d2fb5a009fff187f4a9b90b566b"
    useDigest: true
  # -- Additional containers added to the cilium Envoy DaemonSet.
  extraContainers: []
  # -- Additional envoy container arguments.
  extraArgs: []
  # -- Additional envoy container environment variables.
  extraEnv: []
  # -- Additional envoy hostPath mounts.
  extraHostPathMounts: []
  # - name: host-mnt-data
  #   mountPath: /host/mnt/data
  #   hostPath: /mnt/data
  #   hostPathType: Directory
  #   readOnly: true
  #   mountPropagation: HostToContainer

  # -- Additional envoy volumes.
  extraVolumes: []
  # -- Additional envoy volumeMounts.
  extraVolumeMounts: []
  # -- Configure termination grace period for cilium-envoy DaemonSet.
  terminationGracePeriodSeconds: 1
  # -- TCP port for the health API.
  healthPort: 9878
  # -- cilium-envoy update strategy
  # ref: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#updating-a-daemonset
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # @schema
      # type: [integer, string]
      # @schema
      maxUnavailable: 2
  # -- Roll out cilium envoy pods automatically when configmap is updated.
  rollOutPods: false
  # -- Annotations to be added to all top-level cilium-envoy objects (resources under templates/cilium-envoy)
  annotations: {}
  # -- Security Context for cilium-envoy pods.
  podSecurityContext:
    # -- AppArmorProfile options for the `cilium-envoy` pods
    appArmorProfile:
      type: "Unconfined"
  # -- Annotations to be added to envoy pods
  podAnnotations: {}
  # -- Labels to be added to envoy pods
  podLabels: {}
  # -- Envoy resource limits & requests
  # ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
  resources: {}
  #   limits:
  #     cpu: 4000m
  #     memory: 4Gi
  #   requests:
  #     cpu: 100m
  #     memory: 512Mi

  startupProbe:
    # -- failure threshold of startup probe.
    # 105 x 2s translates to the old behaviour of the readiness probe (120s delay + 30 x 3s)
    failureThreshold: 105
    # -- interval between checks of the startup probe
    periodSeconds: 2
  livenessProbe:
    # -- failure threshold of liveness probe
    failureThreshold: 10
    # -- interval between checks of the liveness probe
    periodSeconds: 30
  readinessProbe:
    # -- failure threshold of readiness probe
    failureThreshold: 3
    # -- interval between checks of the readiness probe
    periodSeconds: 30
  securityContext:
    # -- User to run the pod with
    # runAsUser: 0
    # -- Run the pod with elevated privileges
    privileged: false
    # -- SELinux options for the `cilium-envoy` container
    seLinuxOptions:
      level: 's0'
      # Running with spc_t since we have removed the privileged mode.
      # Users can change it to a different type as long as they have the
      # type available on the system.
      type: 'spc_t'
    capabilities:
      # -- Capabilities for the `cilium-envoy` container.
      # Even though granted to the container, the cilium-envoy-starter wrapper drops
      # all capabilities after forking the actual Envoy process.
      # `NET_BIND_SERVICE` is the only capability that can be passed to the Envoy process by
      # setting `envoy.securityContext.capabilities.keepCapNetBindService=true` (in addition to granting the
      # capability to the container).
      # Note: In case of embedded envoy, the capability must be granted to the cilium-agent container.
      envoy:
        # Used since cilium proxy uses setting IPPROTO_IP/IP_TRANSPARENT
        - NET_ADMIN
        # We need it for now but might not need it for >= 5.11 especially
        # for the 'SYS_RESOURCE'.
        # In >= 5.8 there's already BPF and PERFMON capabilities
        - SYS_ADMIN
        # Both PERFMON and BPF requires kernel 5.8, container runtime
        # cri-o >= v1.22.0 or containerd >= v1.5.0.
        # If available, SYS_ADMIN can be removed.
        #- PERFMON
        #- BPF
      # -- Keep capability `NET_BIND_SERVICE` for Envoy process.
      keepCapNetBindService: false
  # -- Affinity for cilium-envoy.
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              k8s-app: cilium-envoy
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              k8s-app: cilium
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cilium.io/no-schedule
                operator: NotIn
                values:
                  - "true"
  # -- Node selector for cilium-envoy.
  nodeSelector:
    kubernetes.io/os: linux
  # -- Node tolerations for envoy scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - operator: Exists
      # - key: "key"
      #   operator: "Equal|Exists"
      #   value: "value"
      #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
  # @schema
  # type: [null, string]
  # @schema
  # -- The priority class to use for cilium-envoy.
  priorityClassName: ~
  # @schema
  # type: [null, string]
  # @schema
  # -- DNS policy for Cilium envoy pods.
  # Ref: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
  dnsPolicy: ~
  debug:
    admin:
      # -- Enable admin interface for cilium-envoy.
      # This is useful for debugging and should not be enabled in production.
      enabled: false
      # -- Port number (bound to loopback interface).
      # kubectl port-forward can be used to access the admin interface.
      port: 9901
  # -- Configure Cilium Envoy Prometheus options.
  # Note that some of these apply to either cilium-agent or cilium-envoy.
  prometheus:
    # -- Enable prometheus metrics for cilium-envoy
    enabled: true
    serviceMonitor:
      # -- Enable service monitors.
      # This requires the prometheus CRDs to be available (see https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml)
      # Note that this setting applies to both cilium-envoy _and_ cilium-agent
      # with Envoy enabled.
      enabled: false
      # -- Labels to add to ServiceMonitor cilium-envoy
      labels: {}
      # -- Annotations to add to ServiceMonitor cilium-envoy
      annotations: {}
      # -- Interval for scrape metrics.
      interval: "10s"
      # -- Specify the Kubernetes namespace where Prometheus expects to find
      # service monitors configured.
      # namespace: ""
      # -- Relabeling configs for the ServiceMonitor cilium-envoy
      # or for cilium-agent with Envoy configured.
      relabelings:
        - sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: node
          replacement: ${1}
      # @schema
      # type: [null, array]
      # @schema
      # -- Metrics relabeling configs for the ServiceMonitor cilium-envoy
      # or for cilium-agent with Envoy configured.
      metricRelabelings: ~
    # -- Serve prometheus metrics for cilium-envoy on the configured port
    port: "9964"
# -- Enable/Disable use of node label based identity
nodeSelectorLabels: false
# -- Enable resource quotas for priority classes used in the cluster.
resourceQuotas:
  enabled: false
  cilium:
    hard:
      # 5k nodes * 2 DaemonSets (Cilium and cilium node init)
      pods: "10k"
  operator:
    hard:
      # 15 "clusterwide" Cilium Operator pods for HA
      pods: "15"
# Need to document default
##################
#sessionAffinity: false

# -- Do not run Cilium agent when running with clean mode. Useful to completely
# uninstall Cilium as it will stop Cilium from starting and creating artifacts
# on the node.
sleepAfterInit: false
# -- Enable check of service source ranges (currently, only for LoadBalancer).
svcSourceRangeCheck: true
# -- Synchronize Kubernetes nodes to kvstore and perform CNP GC.
synchronizeK8sNodes: true
# -- Configure TLS configuration in the agent.
tls:
  # -- This configures how the Cilium agent loads the secrets used by TLS-aware CiliumNetworkPolicies
  # (namely the secrets referenced by terminatingTLS and originatingTLS).
  # Possible values:
  #   - local
  #   - k8s
  secretsBackend: local
  # -- Base64 encoded PEM values for the CA certificate and private key.
  # This can be used as common CA to generate certificates used by hubble and clustermesh components.
  # It is neither required nor used when cert-manager is used to generate the certificates.
  ca:
    # -- Optional CA cert. If it is provided, it will be used by cilium to
    # generate all other certificates. Otherwise, an ephemeral CA is generated.
    cert: ""
    # -- Optional CA private key. If it is provided, it will be used by cilium to
    # generate all other certificates. Otherwise, an ephemeral CA is generated.
    key: ""
    # -- Generated certificates validity duration in days. This will be used for auto generated CA.
    certValidityDuration: 1095
  # -- Configure the CA trust bundle used for the validation of the certificates
  # leveraged by hubble and clustermesh. When enabled, it overrides the content of the
  # 'ca.crt' field of the respective certificates, allowing for CA rotation with no down-time.
  caBundle:
    # -- Enable the use of the CA trust bundle.
    enabled: false
    # -- Name of the ConfigMap containing the CA trust bundle.
    name: cilium-root-ca.crt
    # -- Entry of the ConfigMap containing the CA trust bundle.
    key: ca.crt
    # -- Use a Secret instead of a ConfigMap.
    useSecret: false
    # If uncommented, creates the ConfigMap and fills it with the specified content.
    # Otherwise, the ConfigMap is assumed to be already present in .Release.Namespace.
    #
    # content: |
    #   -----BEGIN CERTIFICATE-----
    #   ...
    #   -----END CERTIFICATE-----
    #   -----BEGIN CERTIFICATE-----
    #   ...
    #   -----END CERTIFICATE-----
# -- Tunneling protocol to use in tunneling mode and for ad-hoc tunnels.
# Possible values:
#   - ""
#   - vxlan
#   - geneve
# @default -- `"vxlan"`
tunnelProtocol: ""
# -- Enable native-routing mode or tunneling mode.
# Possible values:
#   - ""
#   - native
#   - tunnel
# @default -- `"tunnel"`
routingMode: "native"
# -- Configure VXLAN and Geneve tunnel port.
# @default -- Port 8472 for VXLAN, Port 6081 for Geneve
tunnelPort: 0
# -- Configure what the response should be to traffic for a service without backends.
# "reject" only works on kernels >= 5.10, on lower kernels we fall back to "drop".
# Possible values:
#  - reject (default)
#  - drop
serviceNoBackendResponse: reject
# -- Configure the underlying network MTU to overwrite auto-detected MTU.
# This value doesn't change the host network interface MTU i.e. eth0 or ens0.
# It changes the MTU for cilium_net@cilium_host, cilium_host@cilium_net,
# cilium_vxlan and lxc_health interfaces.
MTU: 1500
# -- Disable the usage of CiliumEndpoint CRD.
disableEndpointCRD: false
wellKnownIdentities:
  # -- Enable the use of well-known identities.
  enabled: false
etcd:
  # -- Enable etcd mode for the agent.
  enabled: false
  # -- List of etcd endpoints
  endpoints:
    - https://CHANGE-ME:2379
  # -- Enable use of TLS/SSL for connectivity to etcd.
  ssl: false
operator:
  # -- Enable the cilium-operator component (required).
  enabled: true
  # -- Roll out cilium-operator pods automatically when configmap is updated.
  rollOutPods: false
  # -- cilium-operator image.
  image:
    # @schema
    # type: [null, string]
    # @schema
    override: ~
    repository: "quay.io/cilium/operator"
    tag: "v1.16.0"
    # operator-generic-digest
    genericDigest: "sha256:d6621c11c4e4943bf2998af7febe05be5ed6fdcf812b27ad4388f47022190316"
    # operator-azure-digest
    azureDigest: "sha256:dd7562e20bc72b55c65e2110eb98dca1dd2bbf6688b7d8cea2bc0453992c121d"
    # operator-aws-digest
    awsDigest: "sha256:8dbe47a77ba8e1a5b111647a43db10c213d1c7dfc9f9aab5ef7279321ad21a2f"
    # operator-alibabacloud-digest
    alibabacloudDigest: "sha256:d2d9f450f2fc650d74d4b3935f4c05736e61145b9c6927520ea52e1ebcf4f3ea"
    useDigest: true
    pullPolicy: "IfNotPresent"
    suffix: ""
  # -- Number of replicas to run for the cilium-operator deployment
  replicas: 1
  # -- The priority class to use for cilium-operator
  priorityClassName: ""
  # -- DNS policy for Cilium operator pods.
  # Ref: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
  dnsPolicy: ""
  # -- cilium-operator update strategy
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # @schema
      # type: [integer, string]
      # @schema
      maxSurge: 25%
      # @schema
      # type: [integer, string]
      # @schema
      maxUnavailable: 50%
  # -- Affinity for cilium-operator
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              io.cilium/app: operator
  # -- Pod topology spread constraints for cilium-operator
  topologySpreadConstraints: []
  #   - maxSkew: 1
  #     topologyKey: topology.kubernetes.io/zone
  #     whenUnsatisfiable: DoNotSchedule

  # -- Node labels for cilium-operator pod assignment
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:
    kubernetes.io/os: linux
  # -- Node tolerations for cilium-operator scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - operator: Exists
      # - key: "key"
      #   operator: "Equal|Exists"
      #   value: "value"
      #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
  # -- Additional cilium-operator container arguments.
  extraArgs: []
  # -- Additional cilium-operator environment variables.
  extraEnv: []
  # -- Additional cilium-operator hostPath mounts.
  extraHostPathMounts: []
  # - name: host-mnt-data
  #   mountPath: /host/mnt/data
  #   hostPath: /mnt/data
  #   hostPathType: Directory
  #   readOnly: true
  #   mountPropagation: HostToContainer

  # -- Additional cilium-operator volumes.
  extraVolumes: []
  # -- Additional cilium-operator volumeMounts.
  extraVolumeMounts: []
  # -- Annotations to be added to all top-level cilium-operator objects (resources under templates/cilium-operator)
  annotations: {}
  # -- HostNetwork setting
  hostNetwork: true
  # -- Security context to be added to cilium-operator pods
  podSecurityContext: {}
  # -- Annotations to be added to cilium-operator pods
  podAnnotations: {}
  # -- Labels to be added to cilium-operator pods
  podLabels: {}
  # PodDisruptionBudget settings
  podDisruptionBudget:
    # -- enable PodDisruptionBudget
    # ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
    enabled: false
    # @schema
    # type: [null, integer, string]
    # @schema
    # -- Minimum number/percentage of pods that should remain scheduled.
    # When it's set, maxUnavailable must be disabled by `maxUnavailable: null`
    minAvailable: null
    # @schema
    # type: [null, integer, string]
    # @schema
    # -- Maximum number/percentage of pods that may be made unavailable
    maxUnavailable: 1
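    # Example (illustrative sketch, not chart defaults; assumes more than one
    # cilium-operator replica): to guarantee that at least one operator pod
    # stays scheduled during voluntary disruptions, use minAvailable and
    # disable maxUnavailable as noted above:
    #   podDisruptionBudget:
    #     enabled: true
    #     minAvailable: 1
    #     maxUnavailable: null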
  # -- cilium-operator resource limits & requests
  # ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
  resources: {}
  #   limits:
  #     cpu: 1000m
  #     memory: 1Gi
  #   requests:
  #     cpu: 100m
  #     memory: 128Mi

  # -- Security context to be added to cilium-operator pods
  securityContext: {}
  # runAsUser: 0

  # -- Interval for endpoint garbage collection.
  endpointGCInterval: "5m0s"
  # -- Interval for cilium node garbage collection.
  nodeGCInterval: "5m0s"
  # -- Interval for identity garbage collection.
  identityGCInterval: "15m0s"
  # -- Timeout for identity heartbeats.
  identityHeartbeatTimeout: "30m0s"
  pprof:
    # -- Enable pprof for cilium-operator
    enabled: false
    # -- Configure pprof listen address for cilium-operator
    address: localhost
    # -- Configure pprof listen port for cilium-operator
    port: 6061
  # -- Enable prometheus metrics for cilium-operator on the configured port at
  # /metrics
  prometheus:
    enabled: true
    port: 9963
    serviceMonitor:
      # -- Enable service monitors.
      # This requires the prometheus CRDs to be available (see https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml)
      enabled: false
      # -- Labels to add to ServiceMonitor cilium-operator
      labels: {}
      # -- Annotations to add to ServiceMonitor cilium-operator
      annotations: {}
      # -- jobLabel to add for ServiceMonitor cilium-operator
      jobLabel: ""
      # -- Interval for scrape metrics.
      interval: "10s"
      # @schema
      # type: [null, array]
      # @schema
      # -- Relabeling configs for the ServiceMonitor cilium-operator
      relabelings: ~
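      # Example (a sketch; assumes the Prometheus Operator exposes the usual
      # Kubernetes service-discovery labels): copy the node name of each
      # scraped pod into a `node` label.
      #   relabelings:
      #     - sourceLabels:
      #         - __meta_kubernetes_pod_node_name
      #       targetLabel: node
      #       replacement: ${1}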
      # @schema
      # type: [null, array]
      # @schema
      # -- Metrics relabeling configs for the ServiceMonitor cilium-operator
      metricRelabelings: ~
  # -- Grafana dashboards for cilium-operator
  # grafana can import dashboards based on the label and value
  # ref: https://github.com/grafana/helm-charts/tree/main/charts/grafana#sidecar-for-dashboards
  dashboards:
    enabled: false
    label: grafana_dashboard
    # @schema
    # type: [null, string]
    # @schema
    namespace: ~
    labelValue: "1"
    annotations: {}
  # -- Skip CRDs creation for cilium-operator
  skipCRDCreation: false
  # -- Remove Cilium node taint from Kubernetes nodes that have a healthy Cilium
  # pod running.
  removeNodeTaints: true
  # @schema
  # type: [null, boolean]
  # @schema
  # -- Taint nodes where Cilium is scheduled but not running. This prevents pods
  # from being scheduled to nodes where Cilium is not the default CNI provider.
  # @default -- same as removeNodeTaints
  setNodeTaints: ~
  # -- Set Node condition NetworkUnavailable to 'false' with the reason
  # 'CiliumIsUp' for nodes that have a healthy Cilium pod.
  setNodeNetworkStatus: true
  unmanagedPodWatcher:
    # -- Restart any pods that are not managed by Cilium.
    restart: true
    # -- Interval, in seconds, to check if there are any pods that are not
    # managed by Cilium.
    intervalSeconds: 15
nodeinit:
  # -- Enable the node initialization DaemonSet
  enabled: false
  # -- node-init image.
  image:
    # @schema
    # type: [null, string]
    # @schema
    override: ~
    repository: "quay.io/cilium/startup-script"
    tag: "c54c7edeab7fde4da68e59acd319ab24af242c3f"
    digest: "sha256:8d7b41c4ca45860254b3c19e20210462ef89479bb6331d6760c4e609d651b29c"
    useDigest: true
    pullPolicy: "IfNotPresent"
  # -- The priority class to use for the nodeinit pod.
  priorityClassName: ""
  # -- node-init update strategy
  updateStrategy:
    type: RollingUpdate
  # -- Additional nodeinit environment variables.
  extraEnv: []
  # -- Additional nodeinit volumes.
  extraVolumes: []
  # -- Additional nodeinit volumeMounts.
  extraVolumeMounts: []
  # -- Affinity for cilium-nodeinit
  affinity: {}
  # -- Node labels for nodeinit pod assignment
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:
    kubernetes.io/os: linux
  # -- Node tolerations for nodeinit scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - operator: Exists
      # - key: "key"
      #   operator: "Equal|Exists"
      #   value: "value"
      #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
  # -- Annotations to be added to all top-level nodeinit objects (resources under templates/cilium-nodeinit)
  annotations: {}
  # -- Annotations to be added to node-init pods.
  podAnnotations: {}
  # -- Labels to be added to node-init pods.
  podLabels: {}
  # -- Security Context for cilium-node-init pods.
  podSecurityContext:
    # -- AppArmorProfile options for the `cilium-node-init` and init containers
    appArmorProfile:
      type: "Unconfined"
  # -- nodeinit resource limits & requests
  # ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
  resources:
    requests:
      cpu: 100m
      memory: 100Mi
  # -- Security context to be added to nodeinit pods.
  securityContext:
    privileged: false
    seLinuxOptions:
      level: 's0'
      # Running with spc_t since we have removed the privileged mode.
      # Users can change it to a different type as long as they have the
      # type available on the system.
      type: 'spc_t'
    capabilities:
      add:
        # Used in iptables. Consider removing once we are iptables-free
        - SYS_MODULE
        # Used for nsenter
        - NET_ADMIN
        - SYS_ADMIN
        - SYS_CHROOT
        - SYS_PTRACE
  # -- bootstrapFile is the location of the file where the bootstrap timestamp is
  # written by the node-init DaemonSet
  bootstrapFile: "/tmp/cilium-bootstrap.d/cilium-bootstrap-time"
  # -- startup offers a way to customize the startup nodeinit script (pre and post position)
  startup:
    preScript: ""
    postScript: ""
  # -- prestop offers a way to customize the prestop nodeinit script (pre and post position)
  prestop:
    preScript: ""
    postScript: ""
preflight:
  # -- Enable Cilium pre-flight resources (required for upgrade)
  enabled: false
  # -- Cilium pre-flight image.
  image:
    # @schema
    # type: [null, string]
    # @schema
    override: ~
    repository: "quay.io/cilium/cilium"
    tag: "v1.16.0"
    # cilium-digest
    digest: "sha256:46ffa4ef3cf6d8885dcc4af5963b0683f7d59daa90d49ed9fb68d3b1627fe058"
    useDigest: true
    pullPolicy: "IfNotPresent"
  # -- The priority class to use for the preflight pod.
  priorityClassName: ""
  # -- preflight update strategy
  updateStrategy:
    type: RollingUpdate
  # -- Additional preflight environment variables.
  extraEnv: []
  # -- Additional preflight volumes.
  extraVolumes: []
  # -- Additional preflight volumeMounts.
  extraVolumeMounts: []
  # -- Affinity for cilium-preflight
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              k8s-app: cilium
  # -- Node labels for preflight pod assignment
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:
    kubernetes.io/os: linux
  # -- Node tolerations for preflight scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - operator: Exists
      # - key: "key"
      #   operator: "Equal|Exists"
      #   value: "value"
      #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
  # -- Annotations to be added to all top-level preflight objects (resources under templates/cilium-preflight)
  annotations: {}
  # -- Security context to be added to preflight pods.
  podSecurityContext: {}
  # -- Annotations to be added to preflight pods
  podAnnotations: {}
  # -- Labels to be added to the preflight pod.
  podLabels: {}
  # PodDisruptionBudget settings
  podDisruptionBudget:
    # -- enable PodDisruptionBudget
    # ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
    enabled: false
    # @schema
    # type: [null, integer, string]
    # @schema
    # -- Minimum number/percentage of pods that should remain scheduled.
    # When it's set, maxUnavailable must be disabled by `maxUnavailable: null`
    minAvailable: null
    # @schema
    # type: [null, integer, string]
    # @schema
    # -- Maximum number/percentage of pods that may be made unavailable
    maxUnavailable: 1
  # -- preflight resource limits & requests
  # ref: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
  resources: {}
  #   limits:
  #     cpu: 4000m
  #     memory: 4Gi
  #   requests:
  #     cpu: 100m
  #     memory: 512Mi

  readinessProbe:
    # -- For how long kubelet should wait before performing the first probe
    initialDelaySeconds: 5
    # -- interval between checks of the readiness probe
    periodSeconds: 5
  # -- Security context to be added to preflight pods
  securityContext: {}
  #   runAsUser: 0

  # -- Path to write the `--tofqdns-pre-cache` file to.
  tofqdnsPreCache: ""
  # -- Configure termination grace period for preflight Deployment and DaemonSet.
  terminationGracePeriodSeconds: 1
  # -- By default we should always validate the installed CNPs before upgrading
  # Cilium. This will make sure the user will have the policies deployed in the
  # cluster with the right schema.
  validateCNPs: true
# -- Explicitly enable or disable priority class.
# .Capabilities.KubeVersion is unsettable in `helm template` calls,
# it depends on k8s libraries version that Helm was compiled against.
# This option makes it possible to explicitly disable setting the priority class,
# which is useful for rendering charts for GKE clusters in advance.
enableCriticalPriorityClass: true
# disableEnvoyVersionCheck removes the check for Envoy, which can be useful
# on AArch64 as the images do not currently ship a version of Envoy.
#disableEnvoyVersionCheck: false
clustermesh:
  # -- Deploy clustermesh-apiserver for clustermesh
  useAPIServer: false
  # -- The maximum number of clusters to support in a ClusterMesh. This value
  # cannot be changed on running clusters, and all clusters in a ClusterMesh
  # must be configured with the same value. Values > 255 will decrease the
  # maximum allocatable cluster-local identities.
  # Supported values are 255 and 511.
  maxConnectedClusters: 255
  # -- Enable the synchronization of Kubernetes EndpointSlices corresponding to
  # the remote endpoints of appropriately-annotated global services through ClusterMesh
  enableEndpointSliceSynchronization: false
  # -- Enable Multi-Cluster Services API support
  enableMCSAPISupport: false
  # -- Annotations to be added to all top-level clustermesh objects (resources under templates/clustermesh-apiserver and templates/clustermesh-config)
  annotations: {}
  # -- Clustermesh explicit configuration.
  config:
    # -- Enable the Clustermesh explicit configuration.
    enabled: false
    # -- Default dns domain for the Clustermesh API servers
    # This is used in the case cluster addresses are not provided
    # and IPs are used.
    domain: mesh.cilium.io
    # -- List of clusters to be peered in the mesh.
    clusters: []
    # clusters:
    # # -- Name of the cluster
    # - name: cluster1
    # # -- Address of the cluster, use this if you created DNS records for
    # # the cluster Clustermesh API server.
    #   address: cluster1.mesh.cilium.io
    # # -- Port of the cluster Clustermesh API server.
    #   port: 2379
    # # -- IPs of the cluster Clustermesh API server, use multiple ones when
    # # you have multiple IPs to access the Clustermesh API server.
    #   ips:
    #   - 172.18.255.201
    # # -- base64 encoded PEM values for the cluster client certificate, private key and certificate authority.
    # # These fields can (and should) be omitted in case the CA is shared across clusters. In that case, the
    # # "remote" private key and certificate available in the local cluster are automatically used instead.
    #   tls:
    #     cert: ""
    #     key: ""
    #     caCert: ""
  apiserver:
    # -- Clustermesh API server image.
    image:
      # @schema
      # type: [null, string]
      # @schema
      override: ~
      repository: "quay.io/cilium/clustermesh-apiserver"
      tag: "v1.16.0"
      # clustermesh-apiserver-digest
      digest: "sha256:a1597b7de97cfa03f1330e6b784df1721eb69494cd9efb0b3a6930680dfe7a8e"
      useDigest: true
      pullPolicy: "IfNotPresent"
    # -- TCP port for the clustermesh-apiserver health API.
    healthPort: 9880
    # -- Configuration for the clustermesh-apiserver readiness probe.
    readinessProbe: {}
    etcd:
      # The etcd binary is included in the clustermesh API server image, so the same image from above is reused.
      # Independent override isn't supported, because clustermesh-apiserver is tested against the etcd version it is
      # built with.

      # -- Specifies the resources for etcd container in the apiserver
      resources: {}
      #   requests:
      #     cpu: 200m
      #     memory: 256Mi
      #   limits:
      #     cpu: 1000m
      #     memory: 256Mi

      # -- Security context to be added to clustermesh-apiserver etcd containers
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      # -- lifecycle setting for the etcd container
      lifecycle: {}
      init:
        # -- Specifies the resources for etcd init container in the apiserver
        resources: {}
        #   requests:
        #     cpu: 100m
        #     memory: 100Mi
        #   limits:
        #     cpu: 100m
        #     memory: 100Mi

        # -- Additional arguments to `clustermesh-apiserver etcdinit`.
        extraArgs: []
        # -- Additional environment variables to `clustermesh-apiserver etcdinit`.
        extraEnv: []
      # @schema
      # enum: [Disk, Memory]
      # @schema
      # -- Specifies whether etcd data is stored in a temporary volume backed by
      # the node's default medium, such as disk, SSD or network storage (Disk), or
      # RAM (Memory). The Memory option enables improved etcd read and write
      # performance at the cost of additional memory usage, which counts against
      # the memory limits of the container.
      storageMedium: Disk
    kvstoremesh:
      # -- Enable KVStoreMesh. KVStoreMesh caches the information retrieved
      # from the remote clusters in the local etcd instance.
      enabled: true
      # -- TCP port for the KVStoreMesh health API.
      healthPort: 9881
      # -- Configuration for the KVStoreMesh readiness probe.
      readinessProbe: {}
      # -- Additional KVStoreMesh arguments.
      extraArgs: []
      # -- Additional KVStoreMesh environment variables.
      extraEnv: []
      # -- Resource requests and limits for the KVStoreMesh container
      resources: {}
      #   requests:
      #     cpu: 100m
      #     memory: 64Mi
      #   limits:
      #     cpu: 1000m
      #     memory: 1024M

      # -- Additional KVStoreMesh volumeMounts.
      extraVolumeMounts: []
      # -- KVStoreMesh Security context
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      # -- lifecycle setting for the KVStoreMesh container
      lifecycle: {}
    service:
      # -- The type of service used for apiserver access.
      type: NodePort
      # -- Optional port to use as the node port for apiserver access.
      #
      # WARNING: make sure to configure a different NodePort in each cluster if
      # kube-proxy replacement is enabled, as Cilium is currently affected by a known
      # bug (#24692) when NodePorts are handled by the KPR implementation. If a service
      # with the same NodePort exists both in the local and the remote cluster, all
      # traffic originating from inside the cluster and targeting the corresponding
      # NodePort will be redirected to a local backend, regardless of whether the
      # destination node belongs to the local or the remote cluster.
      nodePort: 32379
      # -- Annotations for the clustermesh-apiserver
      # For GKE LoadBalancer, use annotation cloud.google.com/load-balancer-type: "Internal"
      # For EKS LoadBalancer, use annotation service.beta.kubernetes.io/aws-load-balancer-internal: "true"
      annotations: {}
      # @schema
      # enum: [Local, Cluster]
      # @schema
      # -- The externalTrafficPolicy of service used for apiserver access.
      externalTrafficPolicy: Cluster
      # @schema
      # enum: [Local, Cluster]
      # @schema
      # -- The internalTrafficPolicy of service used for apiserver access.
      internalTrafficPolicy: Cluster
      # @schema
      # enum: [HAOnly, Always, Never]
      # @schema
      # -- Defines when to enable session affinity.
      # Each replica in a clustermesh-apiserver deployment runs its own discrete
      # etcd cluster. Remote clients connect to one of the replicas through a
      # shared Kubernetes Service. A client reconnecting to a different backend
      # will require a full resync to ensure data integrity. Session affinity
      # can reduce the likelihood of this happening, but may not be supported
      # by all cloud providers.
      # Possible values:
      #  - "HAOnly" (default) Only enable session affinity for deployments with more than 1 replica.
      #  - "Always" Always enable session affinity.
      #  - "Never" Never enable session affinity. Useful in environments where
      #            session affinity is not supported, but may lead to slightly
      #            degraded performance due to more frequent reconnections.
      enableSessionAffinity: "HAOnly"
      # @schema
      # type: [null, string]
      # @schema
      # -- Configure a loadBalancerClass.
      # Sets the loadBalancerClass on the clustermesh-apiserver
      # LB service when the Service type is set to LoadBalancer
      # (requires Kubernetes 1.24+).
      loadBalancerClass: ~
      # @schema
      # type: [null, string]
      # @schema
      # -- Configure a specific loadBalancerIP.
      # Sets a specific loadBalancerIP on the clustermesh-apiserver
      # LB service when the Service type is set to LoadBalancer.
      loadBalancerIP: ~
    # -- Number of replicas run for the clustermesh-apiserver deployment.
    replicas: 1
    # -- lifecycle setting for the apiserver container
    lifecycle: {}
    # -- terminationGracePeriodSeconds for the clustermesh-apiserver deployment
    terminationGracePeriodSeconds: 30
    # -- Additional clustermesh-apiserver arguments.
    extraArgs: []
    # -- Additional clustermesh-apiserver environment variables.
    extraEnv: []
    # -- Additional clustermesh-apiserver volumes.
    extraVolumes: []
    # -- Additional clustermesh-apiserver volumeMounts.
    extraVolumeMounts: []
    # -- Security context to be added to clustermesh-apiserver containers
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
    # -- Security context to be added to clustermesh-apiserver pods
    podSecurityContext:
      runAsNonRoot: true
      runAsUser: 65532
      runAsGroup: 65532
      fsGroup: 65532
    # -- Annotations to be added to clustermesh-apiserver pods
    podAnnotations: {}
    # -- Labels to be added to clustermesh-apiserver pods
    podLabels: {}
    # PodDisruptionBudget settings
    podDisruptionBudget:
      # -- enable PodDisruptionBudget
      # ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
      enabled: false
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Minimum number/percentage of pods that should remain scheduled.
      # When it's set, maxUnavailable must be disabled by `maxUnavailable: null`
      minAvailable: null
      # @schema
      # type: [null, integer, string]
      # @schema
      # -- Maximum number/percentage of pods that may be made unavailable
      maxUnavailable: 1
    # -- Resource requests and limits for the clustermesh-apiserver
    resources: {}
    #   requests:
    #     cpu: 100m
    #     memory: 64Mi
    #   limits:
    #     cpu: 1000m
    #     memory: 1024M

    # -- Affinity for clustermesh.apiserver
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  k8s-app: clustermesh-apiserver
              topologyKey: kubernetes.io/hostname
    # -- Pod topology spread constraints for clustermesh-apiserver
    topologySpreadConstraints: []
    #   - maxSkew: 1
    #     topologyKey: topology.kubernetes.io/zone
    #     whenUnsatisfiable: DoNotSchedule

    # -- Node labels for pod assignment
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
    nodeSelector:
      kubernetes.io/os: linux
    # -- Node tolerations for pod assignment on nodes with taints
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
    tolerations: []
    # -- clustermesh-apiserver update strategy
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        # @schema
        # type: [integer, string]
        # @schema
        maxSurge: 1
        # @schema
        # type: [integer, string]
        # @schema
        maxUnavailable: 0
    # -- The priority class to use for clustermesh-apiserver
    priorityClassName: ""
    tls:
      # -- Configure the clustermesh authentication mode.
      # Supported values:
      # - legacy:     All clusters access remote clustermesh instances with the same
      #               username (i.e., remote). The "remote" certificate must be
      #               generated with CN=remote if provided manually.
      # - migration:  Intermediate mode required to upgrade from legacy to cluster
      #               (and vice versa) with no disruption. Specifically, it enables
      #               the creation of the per-cluster usernames, while still using
      #               the common one for authentication. The "remote" certificate must
      #               be generated with CN=remote if provided manually (same as legacy).
      # - cluster:    Each cluster accesses remote etcd instances with a username
      #               depending on the local cluster name (i.e., remote-<cluster-name>).
      #               The "remote" certificate must be generated with CN=remote-<cluster-name>
      #               if provided manually. Cluster mode is meaningful only when the same
      #               CA is shared across all clusters part of the mesh.
      authMode: legacy
      # -- Allow users to provide their own certificates
      # Users may need to provide their certificates using
      # a mechanism that requires they provide their own secrets.
      # This setting does not apply to any of the auto-generated
      # mechanisms below, it only restricts the creation of secrets
      # via the `tls-provided` templates.
      enableSecrets: true
      # -- Configure automatic TLS certificates generation.
      # A Kubernetes CronJob is used to generate any
      # certificates not provided by the user at installation
      # time.
      auto:
        # -- When set to true, automatically generate a CA and certificates to
        # enable mTLS between clustermesh-apiserver and external workload instances.
        # If set to false, the certs must be provided by setting the appropriate values below.
        enabled: true
        # Sets the method to auto-generate certificates. Supported values:
        # - helm:         This method uses Helm to generate all certificates.
        # - cronJob:      This method uses a Kubernetes CronJob to generate any
        #                 certificates not provided by the user at installation
        #                 time.
        # - certmanager:  This method uses cert-manager to generate & rotate certificates.
        method: helm
        # -- Generated certificates validity duration in days.
        certValidityDuration: 1095
        # -- Schedule for certificates regeneration (regardless of their expiration date).
        # Only used if method is "cronJob". If nil, then no recurring job will be created.
        # Instead, only the one-shot job is deployed to generate the certificates at
        # installation time.
        #
        # Due to the out-of-band distribution of client certs to external workloads the
        # CA is (re)generated only if it is not provided as a helm value and the k8s
        # secret is manually deleted.
        #
        # Defaults to none. Commented syntax gives midnight of the first day of every
        # fourth month. For syntax, see
        # https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-syntax
        # schedule: "0 0 1 */4 *"

        # [Example]
        # certManagerIssuerRef:
        #   group: cert-manager.io
        #   kind: ClusterIssuer
        #   name: ca-issuer
        # -- certmanager issuer used when clustermesh.apiserver.tls.auto.method=certmanager.
        certManagerIssuerRef: {}
      # -- base64 encoded PEM values for the clustermesh-apiserver server certificate and private key.
      # Used if 'auto' is not enabled.
      server:
        cert: ""
        key: ""
        # -- Extra DNS names added to certificate when it's auto generated
        extraDnsNames: []
        # -- Extra IP addresses added to certificate when it's auto generated
        extraIpAddresses: []
      # -- base64 encoded PEM values for the clustermesh-apiserver admin certificate and private key.
      # Used if 'auto' is not enabled.
      admin:
        cert: ""
        key: ""
      # -- base64 encoded PEM values for the clustermesh-apiserver client certificate and private key.
      # Used if 'auto' is not enabled.
      client:
        cert: ""
        key: ""
      # -- base64 encoded PEM values for the clustermesh-apiserver remote cluster certificate and private key.
      # Used if 'auto' is not enabled.
      remote:
        cert: ""
        key: ""
    # clustermesh-apiserver Prometheus metrics configuration
    metrics:
      # -- Enables exporting apiserver metrics in OpenMetrics format.
      enabled: true
      # -- Configure the port the apiserver metric server listens on.
      port: 9962
      kvstoremesh:
        # -- Enables exporting KVStoreMesh metrics in OpenMetrics format.
        enabled: true
        # -- Configure the port the KVStoreMesh metric server listens on.
        port: 9964
      etcd:
        # -- Enables exporting etcd metrics in OpenMetrics format.
        enabled: true
        # -- Set level of detail for etcd metrics; specify 'extensive' to include server side gRPC histogram metrics.
        mode: basic
        # -- Configure the port the etcd metric server listens on.
        port: 9963
      serviceMonitor:
        # -- Enable service monitor.
        # This requires the prometheus CRDs to be available (see https://github.com/prometheus-operator/prometheus-operator/blob/main/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml)
        enabled: false
        # -- Labels to add to ServiceMonitor clustermesh-apiserver
        labels: {}
        # -- Annotations to add to ServiceMonitor clustermesh-apiserver
        annotations: {}
        # -- Specify the Kubernetes namespace where Prometheus expects to find
        # service monitors configured.
        # namespace: ""

        # -- Interval for scrape metrics (apiserver metrics)
        interval: "10s"
        # @schema
        # type: [null, array]
        # @schema
        # -- Relabeling configs for the ServiceMonitor clustermesh-apiserver (apiserver metrics)
        relabelings: ~
        # @schema
        # type: [null, array]
        # @schema
        # -- Metrics relabeling configs for the ServiceMonitor clustermesh-apiserver (apiserver metrics)
        metricRelabelings: ~
        kvstoremesh:
          # -- Interval for scrape metrics (KVStoreMesh metrics)
          interval: "10s"
          # @schema
          # type: [null, array]
          # @schema
          # -- Relabeling configs for the ServiceMonitor clustermesh-apiserver (KVStoreMesh metrics)
          relabelings: ~
          # @schema
          # type: [null, array]
          # @schema
          # -- Metrics relabeling configs for the ServiceMonitor clustermesh-apiserver (KVStoreMesh metrics)
          metricRelabelings: ~
        etcd:
          # -- Interval for scrape metrics (etcd metrics)
          interval: "10s"
          # @schema
          # type: [null, array]
          # @schema
          # -- Relabeling configs for the ServiceMonitor clustermesh-apiserver (etcd metrics)
          relabelings: ~
          # @schema
          # type: [null, array]
          # @schema
          # -- Metrics relabeling configs for the ServiceMonitor clustermesh-apiserver (etcd metrics)
          metricRelabelings: ~
# -- Configure external workloads support
externalWorkloads:
  # -- Enable support for external workloads, such as VMs (false by default).
  enabled: false
# -- Configure cgroup related configuration
cgroup:
  autoMount:
    # -- Enable auto mount of cgroup2 filesystem.
    # When `autoMount` is enabled, cgroup2 filesystem is mounted at
    # `cgroup.hostRoot` path on the underlying host and inside the cilium agent pod.
    # If users disable `autoMount`, it's expected that users have mounted
    # cgroup2 filesystem at the specified `cgroup.hostRoot` volume, and then the
    # volume will be mounted inside the cilium agent pod at the same path.
    enabled: true
    # -- Init Container Cgroup Automount resource limits & requests
    resources: {}
    #   limits:
    #     cpu: 100m
    #     memory: 128Mi
    #   requests:
    #     cpu: 100m
    #     memory: 128Mi
  # -- Configure cgroup root where cgroup2 filesystem is mounted on the host (see also: `cgroup.autoMount`)
  hostRoot: /run/cilium/cgroupv2
# -- Configure sysctl override described in #20072.
sysctlfix:
  # -- Enable the sysctl override. When enabled, the init container will mount the /proc of the host so that the `sysctlfix` utility can execute.
  enabled: true
# -- Configure whether to enable auto-detection of the terminating state for
# endpoints in order to support graceful termination.
enableK8sTerminatingEndpoint: true
# -- Configure whether to unload DNS policy rules on graceful shutdown
# dnsPolicyUnloadOnShutdown: false

# -- Configure the key of the taint indicating that Cilium is not ready on the node.
# When set to a value starting with `ignore-taint.cluster-autoscaler.kubernetes.io/`, the Cluster Autoscaler will ignore the taint on its decisions, allowing the cluster to scale up.
agentNotReadyTaintKey: "node.cilium.io/agent-not-ready"
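# Example (illustrative; the suffix after the prefix is an arbitrary choice):
# a key that the Cluster Autoscaler ignores when making scaling decisions.
#   agentNotReadyTaintKey: "ignore-taint.cluster-autoscaler.kubernetes.io/cilium-agent-not-ready"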
dnsProxy:
  # -- Timeout (in seconds) when closing the connection between the DNS proxy and the upstream server. If set to 0, the connection is closed immediately (with TCP RST). If set to -1, the connection is closed asynchronously in the background.
  socketLingerTimeout: 10
  # -- DNS response code for rejecting DNS requests, available options are '[nameError refused]'.
  dnsRejectResponseCode: refused
  # -- Allow the DNS proxy to compress responses to endpoints that are larger than 512 Bytes or the EDNS0 option, if present.
  enableDnsCompression: true
  # -- Maximum number of IPs to maintain per FQDN name for each endpoint.
  endpointMaxIpPerHostname: 50
  # -- Time during which idle but previously active connections with expired DNS lookups are still considered alive.
  idleConnectionGracePeriod: 0s
  # -- Maximum number of IPs to retain for expired DNS lookups with still-active connections.
  maxDeferredConnectionDeletes: 10000
  # -- The minimum time, in seconds, to use DNS data for toFQDNs policies. If
  # the upstream DNS server returns a DNS record with a shorter TTL, Cilium
  # overwrites the TTL with this value. Setting this value to zero means that
  # Cilium will honor the TTLs returned by the upstream DNS server.
  minTtl: 0
  # -- DNS cache data at this path is preloaded on agent startup.
  preCache: ""
  # -- Global port on which the in-agent DNS proxy should listen. Default 0 is a OS-assigned port.
  proxyPort: 0
  # -- The maximum time the DNS proxy holds an allowed DNS response before sending it along. Responses are sent as soon as the datapath is updated with the new IP information.
  proxyResponseMaxDelay: 100ms
  # -- DNS proxy operation mode (true/false, or unset to use version dependent defaults)
  # enableTransparentMode: true
# -- SCTP Configuration Values
sctp:
  # -- Enable SCTP support. NOTE: Currently, SCTP support does not support rewriting ports or multihoming.
  enabled: false
# Configuration for types of authentication for Cilium (beta)
authentication:
  # -- Enable authentication processing and garbage collection.
  # Note that if disabled, policy enforcement will still block requests that require authentication.
  # But the resulting authentication requests for these requests will not be processed, therefore the requests not be allowed.
  enabled: true
  # -- Buffer size of the channel Cilium uses to receive authentication events from the signal map.
  queueSize: 1024
  # -- Buffer size of the channel Cilium uses to receive certificate expiration events from auth handlers.
  rotatedIdentitiesQueueSize: 1024
  # -- Interval for garbage collection of auth map entries.
  gcInterval: "5m0s"
  # Configuration for Cilium's service-to-service mutual authentication using TLS handshakes.
  # Note that this is not full mTLS support without also enabling encryption of some form.
  # Current encryption options are WireGuard or IPsec, configured in encryption block above.
  mutual:
    # -- Port on the agent where mutual authentication handshakes between agents will be performed
    port: 4250
    # -- Timeout for connecting to the remote node TCP socket
    connectTimeout: 5s
    # Settings for SPIRE
    spire:
      # -- Enable SPIRE integration (beta)
      enabled: false
      # -- Annotations to be added to all top-level spire objects (resources under templates/spire)
      annotations: {}
      # Settings to control the SPIRE installation and configuration
      install:
        # -- Enable SPIRE installation.
        # This will only take effect only if authentication.mutual.spire.enabled is true
        enabled: true
        # -- SPIRE namespace to install into
        namespace: cilium-spire
        # -- SPIRE namespace already exists. Set to true if Helm should not create, manage, and import the SPIRE namespace.
        existingNamespace: false
        # -- init container image of SPIRE agent and server
        initImage:
          # @schema
          # type: [null, string]
          # @schema
          override: ~
          repository: "docker.io/library/busybox"
          tag: "1.36.1"
          digest: "sha256:9ae97d36d26566ff84e8893c64a6dc4fe8ca6d1144bf5b87b2b85a32def253c7"
          useDigest: true
          pullPolicy: "IfNotPresent"
        # SPIRE agent configuration
        agent:
          # -- SPIRE agent image
          image:
            # @schema
            # type: [null, string]
            # @schema
            override: ~
            repository: "ghcr.io/spiffe/spire-agent"
            tag: "1.9.6"
            digest: "sha256:5106ac601272a88684db14daf7f54b9a45f31f77bb16a906bd5e87756ee7b97c"
            useDigest: true
            pullPolicy: "IfNotPresent"
          # -- SPIRE agent service account
          serviceAccount:
            create: true
            name: spire-agent
          # -- SPIRE agent annotations
          annotations: {}
          # -- SPIRE agent labels
          labels: {}
          # -- SPIRE Workload Attestor kubelet verification.
          skipKubeletVerification: true
          # -- SPIRE agent tolerations configuration
          # By default it follows the same tolerations as the agent itself
          # to allow the Cilium agent on this node to connect to SPIRE.
          # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
          tolerations:
            - key: node.kubernetes.io/not-ready
              effect: NoSchedule
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
            - key: node-role.kubernetes.io/control-plane
              effect: NoSchedule
            - key: node.cloudprovider.kubernetes.io/uninitialized
              effect: NoSchedule
              value: "true"
            - key: CriticalAddonsOnly
              operator: "Exists"
          # -- SPIRE agent affinity configuration
          affinity: {}
          # -- SPIRE agent nodeSelector configuration
          # ref: ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
          nodeSelector: {}
          # -- Security context to be added to spire agent pods.
          # SecurityContext holds pod-level security attributes and common container settings.
          # ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod
          podSecurityContext: {}
          # -- Security context to be added to spire agent containers.
          # SecurityContext holds pod-level security attributes and common container settings.
          # ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
          securityContext: {}
        server:
          # -- SPIRE server image
          image:
            # @schema
            # type: [null, string]
            # @schema
            override: ~
            repository: "ghcr.io/spiffe/spire-server"
            tag: "1.9.6"
            digest: "sha256:59a0b92b39773515e25e68a46c40d3b931b9c1860bc445a79ceb45a805cab8b4"
            useDigest: true
            pullPolicy: "IfNotPresent"
          # -- SPIRE server service account
          serviceAccount:
            create: true
            name: spire-server
          # -- SPIRE server init containers
          initContainers: []
          # -- SPIRE server annotations
          annotations: {}
          # -- SPIRE server labels
          labels: {}
          # SPIRE server service configuration
          service:
            # -- Service type for the SPIRE server service
            type: ClusterIP
            # -- Annotations to be added to the SPIRE server service
            annotations: {}
            # -- Labels to be added to the SPIRE server service
            labels: {}
          # -- SPIRE server affinity configuration
          affinity: {}
          # -- SPIRE server nodeSelector configuration
          # ref: ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
          nodeSelector: {}
          # -- SPIRE server tolerations configuration
          # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
          tolerations: []
          # SPIRE server datastorage configuration
          dataStorage:
            # -- Enable SPIRE server data storage
            enabled: true
            # -- Size of the SPIRE server data storage
            size: 1Gi
            # -- Access mode of the SPIRE server data storage
            accessMode: ReadWriteOnce
            # @schema
            # type: [null, string]
            # @schema
            # -- StorageClass of the SPIRE server data storage
            storageClass: null
          # -- Security context to be added to spire server pods.
          # SecurityContext holds pod-level security attributes and common container settings.
          # ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod
          podSecurityContext: {}
          # -- Security context to be added to spire server containers.
          # SecurityContext holds pod-level security attributes and common container settings.
          # ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container
          securityContext: {}
          # SPIRE CA configuration
          ca:
            # -- SPIRE CA key type
            # AWS requires the use of RSA. EC cryptography is not supported
            keyType: "rsa-4096"
            # -- SPIRE CA Subject
            subject:
              country: "US"
              organization: "SPIRE"
              commonName: "Cilium SPIRE CA"
      # @schema
      # type: [null, string]
      # @schema
      # -- SPIRE server address used by Cilium Operator
      #
      # If k8s Service DNS along with port number is used (e.g. <service-name>.<namespace>.svc(.*):<port-number> format),
      # Cilium Operator will resolve its address by looking up the clusterIP from Service resource.
      #
      # Example values: 10.0.0.1:8081, spire-server.cilium-spire.svc:8081
      serverAddress: ~
      # -- SPIFFE trust domain to use for fetching certificates
      trustDomain: spiffe.cilium
      # -- SPIRE socket path where the SPIRE delegated api agent is listening
      adminSocketPath: /run/spire/sockets/admin.sock
      # -- SPIRE socket path where the SPIRE workload agent is listening.
      # Applies to both the Cilium Agent and Operator
      agentSocketPath: /run/spire/sockets/agent/agent.sock
      # -- SPIRE connection timeout
      connectionTimeout: 30s

If you wish to check the default values.yaml, you could run

helm show values cilium/cilium

Install Cilium

helm -n kube-system install cilium cilium/cilium --values ./values.yaml

After the installation, all pods should be deployed successfully. You can check with this command.

kubectl get pods -A

And Cilium should be in charge of the cluster's networking.

Install cilium CLI tool: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli

You should see something similar to the output below.

cilium status

    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet              cilium-envoy       Desired: 1, Ready: 1/1, Available: 1/1
Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet              cilium             Desired: 1, Ready: 1/1, Available: 1/1
Containers:            hubble-ui          Running: 1
                       cilium-envoy       Running: 1
                       cilium-operator    Running: 1
                       cilium             Running: 1
                       hubble-relay       Running: 1
Cluster Pods:          59/59 managed by Cilium
Helm chart version:
Image versions         cilium             quay.io/cilium/cilium:v1.16.0@sha256:46ffa4ef3cf6d8885dcc4af5963b0683f7d59daa90d49ed9fb68d3b1627fe058: 1
                       hubble-relay       quay.io/cilium/hubble-relay:v1.16.0@sha256:33fca7776fc3d7b2abe08873319353806dc1c5e07e12011d7da4da05f836ce8d: 1
                       hubble-ui          quay.io/cilium/hubble-ui:v0.13.1@sha256:e2e9313eb7caf64b0061d9da0efbdad59c6c461f6ca1752768942bfeda0796c6: 1
                       hubble-ui          quay.io/cilium/hubble-ui-backend:v0.13.1@sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b: 1
                       cilium-envoy       quay.io/cilium/cilium-envoy:v1.29.7-39a2a56bbd5b3a591f69dbca51d3e30ef97e0e51@sha256:bd5ff8c66716080028f414ec1cb4f7dc66f40d2fb5a009fff187f4a9b90b566b: 1
                       cilium-operator    quay.io/cilium/operator-generic:v1.16.0@sha256:d6621c11c4e4943bf2998af7febe05be5ed6fdcf812b27ad4388f47022190316: 1

L2 Announcement and Load Balancer IP range

Apply these settings to enable L2 announcement for load balancer services.

apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-policy
  namespace: kube-system
spec:
  interfaces:
  - eno1
  externalIPs: true
  loadBalancerIPs: true
---

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
  namespace: kube-system
spec:
  blocks:
  - start: "192.168.99.60"
    stop: "192.168.99.80"

kubectl -n kube-system apply -f loadbalancer-related.yaml

Bonus

`crictl`

Sometimes you might want to use nsenter on a pod, but before you can do this, you need to find its PID. crictl makes this easy.

crictl installation: https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md

crictl pods --namespace test

POD ID              CREATED             STATE               NAME                              NAMESPACE           ATTEMPT             RUNTIME
93184eb97ff7d       5 days ago          Ready               test-application-8bdd4658-n96pd   test                0                   (default)
0cb345d9b4ae1       3 weeks ago         Ready               nginx-6df6f855d4-9bdf4            test                0                   (default)
d8dd8f09ccf1d       3 weeks ago         Ready               httpbin-6c78b8c76b-ddjgs          test                0                   (default)
b0fea361e3c0a       3 weeks ago         Ready               echo-same-node-c74f867d5-hznd5    cilium-test         0                   (default)
60e96562ba65f       3 weeks ago         Ready               client2-7b7957dc98-dtfhh          cilium-test         0                   (default)
9df92b7bdfae8       3 weeks ago         Ready               client-d48766cfd-q6q9c            cilium-test         0                   (default)
15cd5237e096f       3 weeks ago         Ready               dind-57bbc5b444-r7ghf             test                0                   (default)

Use the POD ID and a go template to print out the info we want

crictl inspectp --output go-template --template '{{ .info.pid }}' 60e96562ba65f
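The two steps (look up the PID, then enter the namespace) can be combined. Below is a minimal sketch; `pod_netns_exec` is a hypothetical helper name, and it assumes you run it as root on the node so that both `crictl` and `nsenter` have the permissions they need.

```shell
# Hypothetical helper: run a command inside a pod sandbox's network namespace.
# Usage: pod_netns_exec <POD_ID> <command> [args...]
pod_netns_exec() {
  pod_id="$1"
  shift
  # Same lookup as above: ask the CRI runtime for the sandbox PID.
  pid="$(crictl inspectp --output go-template --template '{{ .info.pid }}' "$pod_id")" || return 1
  [ -n "$pid" ] || return 1
  # -t selects the target process, -n enters only its network namespace.
  nsenter -t "$pid" -n "$@"
}
```

For example, `pod_netns_exec 60e96562ba65f ip addr` would list the interfaces as seen from inside that pod.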

`etcdctl`

Here are some useful snippets if you ever need to access etcd directly.

https://gist.github.com/superseb/0c06164eef5a097c66e810fe91a9d408

# ~/.bashrc
export ETCDCTL_ENDPOINTS='https://127.0.0.1:2379'
export ETCDCTL_CACERT='/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt'
export ETCDCTL_CERT='/var/lib/rancher/k3s/server/tls/etcd/server-client.crt'
export ETCDCTL_KEY='/var/lib/rancher/k3s/server/tls/etcd/server-client.key'
export ETCDCTL_API=3

etcdctl endpoint health -w table

+------------------------+--------+------------+-------+
|        ENDPOINT        | HEALTH |    TOOK    | ERROR |
+------------------------+--------+------------+-------+
| https://127.0.0.1:2379 |   true | 1.868725ms |       |
+------------------------+--------+------------+-------+

Check current kubelet config

kubectl proxy

curl -X GET http://127.0.0.1:8001/api/v1/nodes/alex-server/proxy/configz | jq .

We can check if our configuration has been applied.

{
    ...
    "maxPods": 250,
    "podPidsLimit": -1,
    "resolvConf": "/root/infra/resolv.conf",
    "cpuCFSQuota": true,
    ...
}
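If you only care about a few fields, the same endpoint can be queried without starting `kubectl proxy`. This is a sketch under a few assumptions: `kubelet_config` is a hypothetical helper name, the node name is passed as an argument, and the fields picked are just the ones shown above. The configz response wraps the settings in a top-level `kubeletconfig` key.

```shell
# Hypothetical helper: print selected kubelet settings for one node.
# Usage: kubelet_config <NODE_NAME>
kubelet_config() {
  node="$1"
  # `kubectl get --raw` calls the API server path directly, reusing the
  # credentials from the current kubeconfig.
  kubectl get --raw "/api/v1/nodes/${node}/proxy/configz" \
    | jq '.kubeletconfig | {maxPods, podPidsLimit, resolvConf}'
}
```

Usage: `kubelet_config alex-server`.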

Persistent storage

Pods, like containers, are throwaway environments. To persist data, we need to store it somewhere else. I went with Longhorn and the local-path provisioner.

Longhorn has replication and backup, supports RWO ( ReadWriteOnce ) and RWX ( ReadWriteMany ), and is easy to use. The installation is very simple as well. Just remember to install it through Rancher Apps if you want it integrated with the Rancher UI. You might need to install open-iscsi and nfs-common before installing Longhorn.

apt install open-iscsi
apt install nfs-common

The local-path provisioner, while not having HA capability, is faster than Longhorn and is pre-installed with k3s.

Backup

K3s creates an etcd snapshot on a schedule by default (every 12 hours, stored in /var/lib/rancher/k3s/server/db/snapshots). Although a snapshot can be used to restore the cluster state, it does not restore the contents of persistent volumes. For those, I use Longhorn's recurring jobs to create a backup every day and save it on an NFS server. This is all configured in the Longhorn UI.
longhorn recurring job

Longhorn recurring job to backup daily

longhorn backup list

A list of my backups

I don't have more machines or hard disks, so the NFS server runs on the same host, and this backup process basically just stores the data in another directory. At least it prevents me from accidentally deleting all the data if I wish to reinstall everything, and it serves as nice practice for backups XD.

PS: The backup process was created before I had a separate server for VPN. The VPN server currently has a 1TB HDD, so I might move the backup target there when I have time.
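Besides the scheduled snapshots, K3s can take an etcd snapshot on demand, which is handy right before an upgrade or a reinstall. A minimal sketch, assuming K3s runs with embedded etcd and the command is run as root on the server node; `etcd_snapshot_now` is a hypothetical helper name and `manual` is an arbitrary default snapshot name.

```shell
# Hypothetical helper: take an on-demand etcd snapshot, then list all snapshots.
# Usage: etcd_snapshot_now [NAME]
etcd_snapshot_now() {
  name="${1:-manual}"
  # `save` writes the snapshot to the snapshot directory,
  # `ls` lists the snapshots K3s knows about.
  k3s etcd-snapshot save --name "$name" && k3s etcd-snapshot ls
}
```

Usage: `etcd_snapshot_now pre-upgrade`. This still only covers cluster state; volume contents rely on the Longhorn backups described above.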

Gateway

We will be using traefik and istio as our gateways. traefik is easy to use but currently offers less control than istio, unless you're using the enterprise version.

traefik: As a simple ingress gateway for private services only accessible in my LAN.
istio: As the gateway for public services exposed to the outside world.

Installation

I installed both with helm.

Traefik

The defaults are nice. I just adjusted the log level and enabled access logs.

# values.yaml
logs:
  access:
    enabled: true
  general:
    level: DEBUG

helm repo add traefik https://traefik.github.io/charts
helm repo update

kubectl create ns traefik

helm -n traefik install traefik traefik/traefik --values ./values.yaml --debug

Istio

Official installation guide: https://istio.io/latest/docs/setup/install/helm/

Add istio helm repo

helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

Create a namespace for istio

kubectl create namespace istio-system

Install CRD

helm install istio-base istio/base -n istio-system --set defaultRevision=default

Install istio

# istio-values.yaml
defaults:
  meshConfig:
    # resources deployed in the root namespace will be applied globally
    rootNamespace: cluster-gateway

helm install istiod istio/istiod -n istio-system --values ./istio-values.yaml

Enable access logging with telemetry

apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: default
  namespace: cluster-gateway
spec:
  accessLogging:
  - providers:
    - name: envoy

Install k8s gateway api

https://gateway-api.sigs.k8s.io/guides/#installing-a-gateway-controller

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.1.0/experimental-install.yaml

A simple httpbin deployment to make sure things are working

apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-test-httpbin
  template:
    metadata:
      labels:
        workload.user.cattle.io/workloadselector: apps.deployment-test-httpbin
    spec:
      containers:
      - image: kennethreitz/httpbin
        imagePullPolicy: Always
        name: container-0
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources: {}
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: test
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    workload.user.cattle.io/workloadselector: apps.deployment-test-httpbin
  type: ClusterIP

---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: cluster-gateway
spec:
  hostnames:
  - istio-httpbin.cloud.alexfangsw.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: httpbin
      namespace: test
      port: 80
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /

---

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  namespace: cluster-gateway
  annotations:
    # automatically generate certificates with cert-manager, covered in the next section
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  gatewayClassName: istio
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      # A DNS record must be set for this domain in the local DNS.
      # It must resolve to the load balancer IP of our gateway.
      hostname: istio-httpbin.cloud.alexfangsw.com
      name: httpbin-test-https
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: istio-httpbin.cloud.alexfangsw.com
        mode: Terminate
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: reference-grant
  namespace: test
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: cluster-gateway
  to:
    - group: ""
      kind: Service

---
# The default operation is deny
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow
  namespace: cluster-gateway
spec:
  selector:
    matchLabels:
      istio.io/gateway-name: gateway
  action: ALLOW
  rules:
    - to:
        - operation:
            hosts:
              - istio-httpbin.cloud.alexfangsw.com

Check the endpoint

You might need to finish the next section first for the TLS certificate

curl https://istio-httpbin.cloud.alexfangsw.com/get

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Host": "istio-httpbin.cloud.alexfangsw.com",
    "User-Agent": "curl/7.81.0",
    "X-Envoy-Attempt-Count": "1",
    "X-Envoy-Decorator-Operation": "httpbin.test.svc.cluster.local:80/*",
    "X-Envoy-Internal": "true",
    "X-Envoy-Peer-Metadata": "ChQKDkFQUF9DT05UQUlORVJTEgIaAAoaCgpDTFVTVEVSX0lEEgwaCkt1YmVybmV0ZXMKHgoMSU5TVEFOQ0VfSVBTEg4aDDEwLjIyMi4wLjEyNQoZCg1JU1RJT19WRVJTSU9OEggaBjEuMjIuMwrwAQoGTEFCRUxTEuUBKuIBCjMKJmdhdGV3YXkubmV0d29ya2luZy5rOHMuaW8vZ2F0ZXdheS1uYW1lEgkaB2dhdGV3YXkKIgoVaXN0aW8uaW8vZ2F0ZXdheS1uYW1lEgkaB2dhdGV3YXkKMgofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIPGg1nYXRld2F5LWlzdGlvCi8KI3NlcnZpY2UuaXN0aW8uaW8vY2Fub25pY2FsLXJldmlzaW9uEggaBmxhdGVzdAoiChdzaWRlY2FyLmlzdGlvLmlvL2luamVjdBIHGgVmYWxzZQoaCgdNRVNIX0lEEg8aDWNsdXN0ZXIubG9jYWwKKAoETkFNRRIgGh5nYXRld2F5LWlzdGlvLTc0NzhjNWM1NTQtbmoyY3YKHgoJTkFNRVNQQUNFEhEaD2NsdXN0ZXItZ2F0ZXdheQpZCgVPV05FUhJQGk5rdWJlcm5ldGVzOi8vYXBpcy9hcHBzL3YxL25hbWVzcGFjZXMvY2x1c3Rlci1nYXRld2F5L2RlcGxveW1lbnRzL2dhdGV3YXktaXN0aW8KIAoNV09SS0xPQURfTkFNRRIPGg1nYXRld2F5LWlzdGlv",
    "X-Envoy-Peer-Metadata-Id": "router~10.222.0.125~gateway-istio-7478c5c554-nj2cv.cluster-gateway~cluster-gateway.svc.cluster.local"
  },
  "origin": "192.168.99.82",
  "url": "https://istio-httpbin.cloud.alexfangsw.com/get"
}

TLS

Now that we have our gateway installed, it's time we add TLS to secure it. We will be using cert-manager to automatically generate certificates for us.

helm repo add jetstack https://charts.jetstack.io

The DNS-related settings are for Let's Encrypt's DNS challenge. enable-gateway-api lets cert-manager automatically generate certificates for istio gateways ( ingress is supported by default ).

# values.yaml
extraArgs:
- --dns01-recursive-nameservers-only
- --dns01-recursive-nameservers=8.8.8.8:53,1.1.1.1:53
- --enable-gateway-api
installCRDs: true

Install cert-manager with a version you like

helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version <VERSION> \
  --set installCRDs=true \
  --values ./values.yaml

Add cluster issuer for generating certificates. Remember to create a secret that contains the cloudflare token.

# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
  namespace: cert-manager
spec:
  acme:
    email: EMAIL
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-cluster-issuer-account
    solvers:
    - dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-api-token
            key: api-token
      selector:
        dnsZones:
        - 'alexfangsw.com'

kubectl -n cert-manager apply -f ./cluster-issuer.yaml
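The secret mentioned above could be created like this. It is a sketch, not the exact command I ran: `create_cloudflare_secret` is a hypothetical helper name, and it assumes cert-manager is installed in the `cert-manager` namespace, which is where a ClusterIssuer looks up its secrets by default. The secret name and key must match `apiTokenSecretRef` in the ClusterIssuer.

```shell
# Hypothetical helper: create the Secret referenced by apiTokenSecretRef.
# Usage: create_cloudflare_secret <CLOUDFLARE_API_TOKEN>
create_cloudflare_secret() {
  token="$1"
  [ -n "$token" ] || { echo "usage: create_cloudflare_secret <token>" >&2; return 1; }
  # Name and key must match the ClusterIssuer: cloudflare-api-token / api-token.
  kubectl -n cert-manager create secret generic cloudflare-api-token \
    --from-literal=api-token="$token"
}
```

Create the secret before applying the ClusterIssuer, otherwise the first DNS challenge fails until the secret exists.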

Now you could just add an annotation on your ingress and istio gateways, and cert-manager will generate the certificates for you.

cert-manager.io/cluster-issuer: letsencrypt

Solving the DNS challenge might take a minute. You can check the progress or error by running

kubectl get challenges.acme.cert-manager.io
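Instead of polling the challenges by hand, you can also block until the certificates are ready. A minimal sketch, assuming the Certificate resources live in the namespace you pass in (for the gateway above that would be `cluster-gateway`); `wait_for_certificates` is a hypothetical helper name and the 300s default timeout is an arbitrary choice.

```shell
# Hypothetical helper: wait for all Certificates in a namespace to become Ready.
# Usage: wait_for_certificates <NAMESPACE> [TIMEOUT]
wait_for_certificates() {
  ns="$1"
  timeout="${2:-300s}"
  if ! kubectl -n "$ns" wait certificate --all --for=condition=Ready --timeout="$timeout"; then
    # On timeout, show the pending challenges to help with debugging.
    kubectl -n "$ns" get challenges.acme.cert-manager.io
    return 1
  fi
}
```

Usage: `wait_for_certificates cluster-gateway`. Note that `kubectl wait --all` reports an error if the namespace has no Certificate resources yet.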

Cluster UI

Install Rancher as the UI for our cluster.

Official installation guide: https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster

Necessary steps:

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest

kubectl create namespace cattle-system

helm repo update

I installed rancher with the hostname 'cloud.alexfangsw.com'. You can choose another one, but make sure to update the local DNS if you do. The bootstrapPassword is only used once, so just set something you like.

helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=cloud.alexfangsw.com \
  --set bootstrapPassword=INIT_PASSWORD

Now head over to the cluster's home page. After you log in and click on the 'local' cluster, you should see the resources visualized.
rancher local cluster

You can add cert-manager annotations on the rancher ingress to generate the certificate for HTTPS.

cert-manager.io/cluster-issuer: letsencrypt

CI/CD

Drone CI

Usage:

Run tests
Build docker images
Run envsubst on local helm chart and update the central helm repository

Each project's helm chart lives in the same repository as the project itself. But to keep the CD configuration simple, the pipeline pushes all charts to a single central git repository.

Installation

Official installation guide: https://github.com/drone/charts/blob/master/charts/drone/docs/install.md

Create a namespace for drone

kubectl create ns drone

Add drone helm chart

helm repo add drone https://charts.drone.io
helm repo update

Create an OAuth Application on GitHub: https://docs.drone.io/server/provider/github/

Install

# drone-values.yaml

# reference: https://docs.drone.io/server/reference/
env:
  DRONE_GITHUB_CLIENT_ID: XXXXXX
  DRONE_GITHUB_CLIENT_SECRET: XXXXXX
  DRONE_RPC_SECRET: XXXXXX

  DRONE_SERVER_HOST: xxx.xxx.com
  DRONE_SERVER_PROTO: https

  # as the one installing drone, you might want to have admin access
  DRONE_USER_CREATE: username:YOUR_GITHUB_USERNAME,admin:true
  # limit users that have login access
  DRONE_USER_FILTER: YOUR_GITHUB_USERNAME

# we will be using longhorn
persistentVolume:
  size: 1Gi
  storageClass: longhorn

helm install --namespace drone drone drone/drone -f drone-values.yaml

Expose drone with istio for github webhook

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  namespace: cluster-gateway
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  gatewayClassName: istio
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      hostname: xxx.xxx.com
      name: drone-https
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: xxx.xxx.com
        mode: Terminate

---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: reference-grant
  namespace: drone
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: cluster-gateway
  to:
    - group: ""
      kind: Service

---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow
  namespace: cluster-gateway
spec:
  selector:
    matchLabels:
      istio.io/gateway-name: gateway
  action: ALLOW
  rules:
    - to:
        - operation:
            hosts:
              - xxx.xxx.com
---

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: drone
  namespace: cluster-gateway
spec:
  hostnames:
  - xxx.xxx.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: drone
      namespace: drone
      port: 8080
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /

Install runners

# values.yaml

rbac:
  buildNamespaces:
    - drone

env:
  DRONE_RPC_SECRET: XXXXXXXX 
  DRONE_RPC_HOST: drone:8080
  DRONE_RPC_PROTO: http
  DRONE_NAMESPACE_DEFAULT: drone
  DRONE_RUNNER_CAPACITY: 5
  DRONE_RUNNER_PRIVILEGED_IMAGES: plugins/docker

helm -n drone install drone-runner-kube drone/drone-runner-kube --values values.yaml

Head over to the official documentation for details on how to create a drone pipeline.

Drone pipeline documentation: https://docs.drone.io/pipeline/kubernetes/overview/

After that, you should be able to start some pipelines and see them running.
drone home page

If for some reason a pipeline doesn't start, check the webhook delivery statuses on the Webhooks tab in your GitHub repository settings.

Harbor

Local image registry. My network speed isn't the best, so storing images locally saves some time, and it can be used as a cache as well.

Installation

Official installation guide: https://goharbor.io/docs/1.10/install-config/harbor-ha-helm/

I'm not sure why the official guide wants us to download the helm chart, I just installed it like any other chart.

kubectl create ns harbor

helm repo add harbor https://helm.goharbor.io

# values.yaml
core:
  replicas: 1
database:
  type: internal
exporter:
  replicas: 1
expose:
  clusterIP:
    annotations: {}
    labels: {}
    name: harbor
    ports:
      httpPort: 80
      httpsPort: 443
    staticClusterIP: ""
  ingress:
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt
    className: ""
    controller: default
    hosts:
      core: docker.cloud.alexfangsw.com
    kubeVersionOverride: ""
    labels: {}
  tls:
    auto:
      commonName: ""
    certSource: secret
    enabled: true
    secret:
      secretName: docker.cloud.alexfangsw.com
  type: ingress
externalURL: https://docker.cloud.alexfangsw.com
harborAdminPassword: XXXXXXXX_Change_After_installation
logLevel: info
persistence:
  enabled: true
  imageChartStorage:
    filesystem:
      rootdirectory: /storage
    type: filesystem
  persistentVolumeClaim:
    database:
      size: 1Gi
      storageClass: longhorn
    jobservice:
      jobLog:
        size: 1Gi
        storageClass: longhorn
    redis:
      size: 1Gi
      storageClass: longhorn
    registry:
      size: 10Gi
      storageClass: longhorn
    trivy:
      size: 5Gi
      storageClass: longhorn
  resourcePolicy: keep
portal:
  replicas: 1
redis:
  type: internal
registry:
  replicas: 1
secretKey: not-a-secure-key
trivy:
  replicas: 1

helm -n harbor install harbor harbor/harbor --values ./values.yaml

Remember to change the default admin password after the first login.

Caching images

Create a proxy cache project in Harbor and use it to pull images. Remember to add `library` before the image name if you are caching an official Docker image from Docker Hub.

Official docker hub images:

DOCKER_HOST/CACHE_PROJECT/library/IMAGE_NAME

ex: docker.cloud.alexfangsw.com/cache/library/golang

Other images:

DOCKER_HOST/CACHE_PROJECT/XXXXX/IMAGE_NAME

ex: docker.cloud.alexfangsw.com/cache/plugins/docker
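The naming rule above can be sketched as a small shell helper. The function name `to_cache_ref` is hypothetical, and it assumes the proxy cache project is called `cache` and only handles Docker Hub style references (no registry host in the name).

```shell
# Hypothetical helper: rewrite a Docker Hub image reference so it is
# pulled through the Harbor proxy cache project.
HARBOR_HOST="docker.cloud.alexfangsw.com"
CACHE_PROJECT="cache"

to_cache_ref() {
  image="$1"
  case "$image" in
    */*) ;;                          # already namespaced, e.g. plugins/docker
    *)   image="library/$image" ;;   # official image, e.g. golang
  esac
  printf '%s/%s/%s\n' "$HARBOR_HOST" "$CACHE_PROJECT" "$image"
}

to_cache_ref golang
to_cache_ref plugins/docker
```

It can then be used like `docker pull "$(to_cache_ref golang)"`.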

🚧 Observability

Monitor

For hardware and resource monitoring I went with Prometheus. The Prometheus stack, which includes Prometheus, Grafana, and Alertmanager, can be installed from Rancher's Apps tab and integrates with the Rancher UI, so we can easily view resource consumption. It also includes CRDs such as ServiceMonitor and PodMonitor, which can be used to scrape our own metrics; combined with alerting rules, we can send customized alerts if needed. Grafana also comes set up with a lot of pre-installed graphs with useful information.
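As a rough illustration of scraping our own metrics, here is a minimal ServiceMonitor sketch. The name `my-app`, its namespace, the `app: my-app` label, and the `metrics` port name are all placeholders, and whether Prometheus picks it up depends on how the stack's selectors are configured.

```shell
# Write a hypothetical ServiceMonitor manifest; names, labels, and the port are placeholders.
cat > servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: my-app
spec:
  selector:
    matchLabels:
      app: my-app        # must match the labels on the Service
  endpoints:
    - port: metrics      # name of the Service port that serves the metrics
      path: /metrics
      interval: 30s
EOF

# Apply it once the monitoring stack (and its CRDs) is installed:
# kubectl apply -f servicemonitor.yaml
```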

The only thing I would like to change is the CRD-based Alertmanager config. For simple rules it's fine, but if we need more control over the configuration, we need to make some modifications during installation so that Alertmanager uses our own config instead of the one generated by the CRD.
