Homelab: resilient and low-power "datacenter-in-two-boxes" with X10SDV Mini ITX Xeon D 1540/1518, CentOS 7.3, Ceph Jewel, ZFS, KVM/QEMU and OpenvSwitch

In this article you will get an overview of how to build a powerful "home lab/datacenter" based on "very cool" open source technologies, in a very small footprint: two network racks, 180 W of power max.

The technical specs are as follows :

  • 3 compute nodes

  • 256GB RAM max capacity (96GB for now)

  • 40 TB of raw storage (30 TB usable), including redundancy and backups

  • 10GbE links between the nodes (for performance: storage traffic and backups)

  • all these services for less than 200 W, please :)

Main goals

  • highly resilient data and capacity (definitely, I don't want to lose my family photos and videos, my work archives, my working environment, or my family's administrative documents)

  • small power footprint, because power is expensive for me (and the planet)

  • hosting all mail and data (photos, videos, …)

  • hosting home automation systems

  • managing telecom devices (IP phones, LTE and DSL internet access)

Hardware overview


Overall map


Computing and storage:

First node (compute/storage):

Hardware

Mini ITX chassis: In Win MS04.265P.ATA, 4-bay hot swap

Supermicro X10SDV-TLNF, 8-core/16-thread Xeon D-1540, Mini ITX (TDP = 45 W)


64GB RAM DDR4 ECC

256 GB Samsung SM951 NVMe SSD

2x 1Gb/s “Intel I350 Gigabit”

2x 10GBE “Intel Ethernet Connection X552/X557-AT 10GBASE-T”

Storage Shelf

U-NAS, 8 hard drives (5x 4 TB WD Red + others)


Software

CentOS 7.3

ZFS, Ceph, KVM, OpenvSwitch

Second node (compute/storage):

Hardware

Mini ITX chassis: In Win MS04.265P.SATA, 4-bay hot swap

Supermicro X10SDV-4C+-TLN2F, 4-core Xeon D-1518 (TDP = 35 W)


32GB RAM DDR4 ECC

2x 1Gb/s “Intel I350 Gigabit”

2x 10GBE “Intel Ethernet Connection X552/X557-AT 10GBASE-T”

4 hard drives (2x 5 TB WD Green + 1 TB Hitachi + SSD)

Software

CentOS 7.3

ZFS, Ceph (in VMs), KVM, OpenvSwitch

Third node (compute: management, home automation software…):

Hardware


Intel NUC5i3RYK

16 GB RAM DDR3

120 GB Kingston M.2 SSD

Software

CentOS 7.3

KVM, OpenvSwitch

Rack mount

All the hardware lives in a garage, in two smart network racks (the UPS sits under the racks, near the power outlet).

The two compute/storage nodes in their final location


Network, compute and storage rack:


KVM/qemu management

I used to manage virtual machines with virt-manager, but I have Apple-oriented devices, and XQuartz is not a friend. For now I use Wok on all nodes; I'm waiting on CloudStack, we'll see in the future:


Network topology between the two main compute nodes : bandwidth optimisation and loop management with RSTP


With the two Supermicro nodes providing four 10GbE ports between them, you want:

1. all traffic to be switched (layer 2) over the 10GbE links (for performance reasons: backups, data transfers, traffic between the Ceph storage nodes…);

2. when a node is shut down (for any reason), all traffic to fail over to the 1 Gb/s links.

To do that, your switch must support RSTP (other mechanisms exist, but RSTP works well, is supported by entry-level network switches, and is supported by OpenvSwitch).

Here is my network topology:

[Network topology diagram: Home Datacenter slide 1.jpg]

Configuration for vs0 (the same on both nodes):

Create the switch

ovs-vsctl add-br vs0

Add the 1Gb/s port

ovs-vsctl add-port vs0 eno1
ovs-vsctl set Bridge vs0 rstp_enable=true
ovs-vsctl set Port eno4 other_config:rstp-port-priority=32
ovs-vsctl set Port eno4 other_config:rstp-path-cost=150

Then, once RSTP is configured (not before :)), add the 10 Gb/s port

ovs-vsctl add-port vs0 eno3

This should work.
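Before trusting the failover, it is worth checking what RSTP actually decided. A quick sketch of the verification (the `rstp/show` appctl command is available in Open vSwitch releases that support RSTP):

```shell
# Confirm RSTP is enabled on the bridge
ovs-vsctl get Bridge vs0 rstp_enable

# Show the RSTP role and state of each port: the preferred path should
# be Forwarding, and the redundant link should sit in Discarding
# (blocked) until the primary path goes down
ovs-appctl rstp/show vs0
```

Pulling the 10 Gb/s cable and re-running `rstp/show` should show the 1 Gb/s port transition to Forwarding within a couple of seconds.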

Finally, set the management IP for the internal port of the switch in /etc/sysconfig/network-scripts/ifcfg-vs0

[root@hyp02 ~]# cat /etc/sysconfig/network-scripts/ifcfg-vs0
DEVICE=vs0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
HOTPLUG=no
IPADDR=192.168.10.75
GATEWAY=192.168.10.1
PREFIX=24
DNS1=192.168.10.1
DOMAIN=localdomain
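For the bridge to come up cleanly at boot, the physical ports can also get matching ifcfg files that hand them over to Open vSwitch. A minimal sketch for the 1 Gb/s port (assuming the initscripts OVS integration shipped with the openvswitch package, and eno1 as the port name):

```
DEVICE=eno1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=vs0
BOOTPROTO=none
HOTPLUG=no
```

A `systemctl restart network` then brings up the bridge, its ports and the management IP in one go.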

First test snapshot: worked well, huh?

[root@hyp01 vol05]# iperf3 -c 192.168.11.2
Connecting to host 192.168.11.2, port 5201
[  4] local 192.168.11.1 port 59982 connected to 192.168.11.2 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.09 GBytes  9.35 Gbits/sec   39    656 KBytes
[  4]   1.00-2.00   sec  1.09 GBytes  9.39 Gbits/sec    0    663 KBytes
[  4]   2.00-3.00   sec  1.09 GBytes  9.40 Gbits/sec    0    686 KBytes
[  4]   3.00-4.00   sec  1.09 GBytes  9.34 Gbits/sec  635    447 KBytes
[  4]   4.00-5.00   sec  1.05 GBytes  9.00 Gbits/sec  144    691 KBytes
[  4]   5.00-6.00   sec  1.09 GBytes  9.36 Gbits/sec    0    707 KBytes
[  4]   6.00-7.00   sec  1.09 GBytes  9.40 Gbits/sec    0    723 KBytes
[  4]   7.00-8.00   sec  1.09 GBytes  9.38 Gbits/sec    0    754 KBytes
[  4]   8.00-9.00   sec  1.09 GBytes  9.35 Gbits/sec  270    632 KBytes
[  4]   9.00-10.00  sec  1.07 GBytes  9.16 Gbits/sec  176    635 KBytes
.........    
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  10.8 GBytes  9.31 Gbits/sec  1264             sender
[  4]   0.00-10.00  sec  10.8 GBytes  9.31 Gbits/sec                  receiver

iperf Done.
[root@hyp01 vol05]# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.11.2, port 33226
[  5] local 192.168.11.1 port 5201 connected to 192.168.11.2 port 33228
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec  1.04 GBytes  8.97 Gbits/sec
[  5]   1.00-2.00   sec  1.06 GBytes  9.13 Gbits/sec
[  5]   2.00-3.00   sec  1.07 GBytes  9.22 Gbits/sec
[  5]   3.00-4.00   sec  1.06 GBytes  9.10 Gbits/sec
[  5]   4.00-5.00   sec  1.07 GBytes  9.16 Gbits/sec
[  5]   5.00-6.00   sec  1.06 GBytes  9.15 Gbits/sec
[  5]   6.00-7.00   sec  1.08 GBytes  9.31 Gbits/sec
[  5]   7.00-8.00   sec  1.07 GBytes  9.22 Gbits/sec
[  5]   8.00-9.00   sec  1.05 GBytes  8.98 Gbits/sec
[  5]   9.00-10.00  sec  1.08 GBytes  9.31 Gbits/sec
[  5]  10.00-10.04  sec  40.6 MBytes  9.43 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-10.04  sec  10.7 GBytes  9.16 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

Done.

Ceph storage cluster

I wrote a separate post on getting a Ceph cluster up and running, monitored with Grafana/InfluxDB.
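Once the cluster is running, the day-to-day checks are short; these are standard Ceph CLI commands (output shapes differ slightly between releases):

```shell
# Cluster health, monitor quorum, OSD count and PG states at a glance
ceph -s

# Per-OSD usage, to spot imbalance between the two storage nodes
ceph osd df

# CRUSH tree: verify each host and its OSDs sit where expected
ceph osd tree
```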

Here is my current config:

[Image: Ceph cluster configuration]

Monitoring

I use Grafana/Telegraf/collectd to collect data from all hosts, including internet bandwidth on OpenWrt.
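On each host, the collection side boils down to a few Telegraf input plugins plus the InfluxDB output. A minimal sketch of /etc/telegraf/telegraf.conf (the plugin names are stock Telegraf; the InfluxDB URL and interface names are placeholders, not my actual setup):

```toml
[[outputs.influxdb]]
  # hypothetical monitoring VM running InfluxDB
  urls = ["http://192.168.10.80:8086"]
  database = "telegraf"

[[inputs.cpu]]
[[inputs.mem]]
[[inputs.disk]]

[[inputs.net]]
  # physical NICs and the OVS bridge
  interfaces = ["eno1", "eno3", "vs0"]
```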

Internet bandwidth and the three hypervisors


UPS…


And Ceph ..


Costs

Here is the total amount of money (not time :)) spent building such a lab:

… working on it..