A home lab Kubernetes cluster using AKS Arc on Azure Local, built on budget hardware but with an enterprise-grade feature set.
Version: 11/25:1.0 public preview - Azure Local release 2510
Motivated by all the nice setups from Lior Kamrat, I want a home lab as well to run tools, game servers, and my own apps, and to test a few things from the CNCF landscape in my local network. But there is not enough budget for AI workloads 😜 - let's skip that for now.
Remarks - this will not be a comparison of the best Kubernetes cluster or virtual machine home lab setup across all available offerings. I'll focus on Azure Local and use UniFi hardware, simply because I know both very well and it will be quicker for me than starting from scratch.
OK - what's the goal: create an Azure Local instance, run an AKS Arc Kubernetes cluster on it, and avoid having anything to do with an Active Directory domain. I'll use the "AD-less" deployment - deploy Azure Local using local identity with Azure Key Vault - and create a Kubernetes cluster on it. There is no internet proxy to overcome at home, so I'll skip the Arc Gateway for now. I'll test offline mode and Arc Gateway in a second step.
Note - Azure Local using local identity with Azure Key Vault is still flagged as preview.
As much as I like Ben's home lab - there is no rack space at home 🤷♂️ - better to go the other direction: affordable hardware, as small and silent as possible, at the sweet spot of price/value and with low power consumption. A thank you here to Stefan Denninger for suggesting the Minisforum MS-01 - that little thing fulfills all low-capacity Azure Local single-node requirements except ECC memory! But more on that later. For this project I use:
I run a UniFi Dream Machine Special Edition with a public static IP and a WireGuard VPN to connect to my home network from everywhere - it fulfills all requirements for any kind of home lab you can imagine, except that I'm always running out of switch ports 🙄
You need a DNS server. One option is to use the DNS server from the Dream Machine, but for this setup I decided to use one of my test domains and the DNS server from the hosting provider. There are two DNS entries required:
myhostname.mydomain.com -> 192.168.1.1
azlmyname.mydomain.com -> 192.168.2.1 (the first IP of the six required for Azure Local)
First thing to do - connect a monitor, mouse, and keyboard and set the primary display from 'auto' to 'HG' in the BIOS, otherwise I always got a black screen after a reboot. It doesn't seem to be an issue with my monitor.
I wanted to use Intel vPro / AMT - there are very good guides from https://www.youtube.com/@RaspberryPiCloud or Space Terran available. My advice - skip it. Even after finally figuring out the complex password policy for this in the BIOS, I failed to make it work within two hours and decided to use Remote Desktop and PS remoting instead. For the few times you need it, save yourself the time - connecting a monitor, mouse, and keyboard takes a few minutes.
Alright - let's get Azure Local and create a bootable USB stick with Rufus. Meanwhile, prepare your Azure subscription:
Log on to the machine; Sconfig shows up. Enable Remote Desktop (alternatively via PowerShell with Enable-ASRemoteDesktop) and let's do the infra setup.
DHCP is not supported, so let's configure the IP on the Intel X710 adapter (RSS support!). I use the IP range 192.168.0.0/20 (192.168.0.0 - 192.168.15.255); a PowerShell sketch follows the values below.
192.168.1.1 / 255.255.240.0
Gateway: 192.168.0.1
DNS: 192.168.0.1
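If you'd rather script this than click through Sconfig, here is a minimal sketch of the same configuration in PowerShell - the interface alias "Ethernet 3" is just a placeholder, check the real name of the X710 port with Get-NetAdapter first:
# List adapters to find the X710 port to configure
Get-NetAdapter
# Static IP with /20 prefix, gateway and DNS as listed above (alias is a placeholder)
New-NetIPAddress -InterfaceAlias "Ethernet 3" -IPAddress 192.168.1.1 -PrefixLength 20 -DefaultGateway 192.168.0.1
Set-DnsClientServerAddress -InterfaceAlias "Ethernet 3" -ServerAddresses 192.168.0.1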
Set the hostname to myhostname with Sconfig and restart the machine. Add a DNS entry for myhostname.mydomain.com at your local DNS server / domain hoster - in my case the DNS server of the hosting provider. Verify with
nslookup myhostname
nslookup myhostname.mydomain.com
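With the name resolving, you can also skip the physical console entirely and manage the node over PowerShell remoting. A minimal sketch, assuming a workgroup setup (no AD), so the node has to be trusted for WinRM on your workstation first:
# On the admin workstation: trust the workgroup node for WinRM
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "myhostname.mydomain.com" -Concatenate
# Interactive remote session with the local admin credentials
Enter-PSSession -ComputerName myhostname.mydomain.com -Credential (Get-Credential)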
Create a local admin user for the Azure Local deployment
$Password = Read-Host -AsSecureString
$params = @{
Name = 'myuser'
Password = $Password
FullName = 'Firstname Lastname'
Description = 'local setup'
}
New-LocalUser @params
Add-LocalGroupMember -Group "Administrators" -Member "myuser"
I don't have ECC memory modules and need to disable the check. Luckily, there is a way to skip the ECC requirement check: an ExcludeTests file for the environment checker. This skips the ECC tests documented in AzStackHciConnectivity\AzStackHci.Connectivity.psm1 and AzStackHciHardware\AzStackHci.Hardware.psm1.
New-Item -Path "C:\Program Files\WindowsPowerShell\Modules\AzStackHci.EnvironmentChecker\ExcludeTests.txt" -ItemType File
Set-Content -Path "C:\Program Files\WindowsPowerShell\Modules\AzStackHci.EnvironmentChecker\ExcludeTests.txt" -Value "Test-MemoryProperties"
Register the machine with Azure Arc: Onboard by script
$Tenant = "<my tenant>"
$Subscription = "<my subscription>"
$RG = "<my resource group>"
$Region = "<my region>"
Invoke-AzStackHciArcInitialization -TenantId $Tenant -SubscriptionID $Subscription -ResourceGroup $RG -Region $Region -Cloud "AzureCloud"
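The cmdlet above comes from the Arc registration module on the PowerShell Gallery, so before running it I'd install the module and sign in. A sketch of these prerequisites (module names as in the Azure Local onboarding docs; exact required versions change between releases, so check the current docs):
# Trust the gallery and pull the Arc registration script plus Az.Accounts
Set-PSRepository -Name "PSGallery" -InstallationPolicy Trusted
Install-Module AzsHCI.ARCinstaller -Force
Install-Module Az.Accounts -Force
# Sign in and grab an ARM token / account id for the onboarding cmdlet
Connect-AzAccount -TenantId $Tenant -Subscription $Subscription -UseDeviceAuthentication
$ARMtoken = (Get-AzAccessToken).Token
$id = (Get-AzContext).Account.Id
If I recall the onboarding docs correctly, the Invoke call can also take -ArmAccessToken $ARMtoken and -AccountID $id so it doesn't prompt for a separate login on the node.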
Open the Azure portal and create an Azure Local instance as described here:
Instance name azlmyname
Group management and compute at network adapter 1 - 192.168.1.1
Nodes and Instance IPs:
192.168.2.1 - 192.168.2.254
255.255.240.0
Gateway: 192.168.0.1
DNS: 192.168.0.1
Create a DNS entry for the cluster object with the first IP from the range as documented here.
A record azlmyname.mydomain.com to 192.168.2.1
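Verify that the record resolves before starting the deployment:
nslookup azlmyname.mydomain.com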
If validation fails at AzStackHci_Hardware_Test_PhysicalDisk, the data disks need to be clean and empty. If this command:
Get-PhysicalDisk | Format-Table PhysicalLocation, UniqueId, SerialNumber, CanPool, CannotPoolReason, BusType, MediaType, Size, FriendlyName
shows two SSDs with CanPool false and CannotPoolReason "In a Pool", delete the SU1_Pool storage pool and reset the disks with:
Get-StoragePool
Remove-StoragePool -FriendlyName SU1_Pool
Reset-PhysicalDisk -UniqueId eui.0025385551A3CDA1
Reset-PhysicalDisk -UniqueId eui.0025385B4141DD51
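Afterwards, a quick re-check should show CanPool = True and an empty CannotPoolReason for both disks:
Get-PhysicalDisk | Format-Table FriendlyName, CanPool, CannotPoolReason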
AKS Arc requires an Azure Local logical network; I use the next 254 IPs in my range. If you have a bigger Azure Local cluster, use a bigger range. You can click it together in the Azure portal or use the CLI (see the sketch after the values below):
192.168.0.0/20
192.168.3.1 - 192.168.3.254
255.255.240.0
Gateway: 192.168.0.1
DNS: 192.168.0.1
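A sketch of the CLI variant - the command comes from the az stack-hci-vm extension; the resource group, custom location ID, logical network name and vm-switch name are placeholders (check the switch name with Get-VMSwitch on the node):
$rg="<my resource group>"
$cl="<my custom location resource id>"
az stack-hci-vm network lnet create --resource-group $rg --custom-location $cl --name "aksarc-lnet" --vm-switch-name "<my vm switch>" --ip-allocation-method "Static" --address-prefixes "192.168.0.0/20" --gateway "192.168.0.1" --dns-servers "192.168.0.1" --ip-pool-start "192.168.3.1" --ip-pool-end "192.168.3.254"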
Use Azure Portal or the CLI:
Note: it's a good idea to create an SSH key upfront and keep the private key in your password manager - having the private key enables troubleshooting options later on. Create an SSH key as documented here and use it when creating the cluster in the Azure portal or CLI.
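For example, to generate the key pair that the cluster create below expects (id_rsa.pub in the current folder):
ssh-keygen -t rsa -b 4096 -f .\id_rsa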
$name="<my cluster>"
$sub="<my subscription>"
$rg="<my resource group>"
$cl="<my custom location>"
$logicnet="<logical network created before>"
$controlplanesize="Standard_A2_v2" # 2 core, 4GB mem
$nodesize="Standard_A4_v2" # 4 cores, 8GB mem
az aksarc create -n $name -g $rg --custom-location $cl --vnet-ids $logicnet --control-plane-vm-size $controlplanesize --node-vm-size $nodesize --ssh-key-value .\id_rsa.pub --verbose
To reach deployments, let's add a load balancer. I'll use the next 254 IPs in the range: 192.168.4.1 - 192.168.4.254. Enable it via the Azure portal or use the CLI:
$resource="subscriptions/$sub/resourceGroups/$rg/providers/Microsoft.Kubernetes/connectedClusters/$name"
$lbName="$name-lb"
$ipRange="192.168.4.1-192.168.4.254"
az k8s-runtime load-balancer enable --resource-uri $resource
az k8s-runtime load-balancer create --load-balancer-name $lbName --resource-uri $resource --addresses $ipRange --advertise-mode ARP
az aksarc get-credentials -g $rg -n $name --admin
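A quick smoke test with plain kubectl - the nginx deployment is just an example - to confirm the nodes are up and the load balancer hands out an external IP from the 192.168.4.x range:
kubectl get nodes
# Deploy something and expose it through the new load balancer
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer
# EXTERNAL-IP should come from 192.168.4.1 - 192.168.4.254
kubectl get service nginx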
Remote Desktop will be disabled, so my best way to access the machine (without AD features) is, guess what? SSH!
Follow the Enable SSH access to Arc-enabled servers instructions to enable SSH using the Azure CLI.
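Once enabled, connecting looks roughly like this - assuming the ssh extension for the Azure CLI and the local user created earlier; $rg is the resource group of the Arc-enabled machine:
az extension add --name ssh
az ssh arc --resource-group $rg --name myhostname --local-user myuser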