Hardware considerations:
I am deploying a disaggregated S2D scenario consisting of 3 clusters, each with 3 nodes.
- 2x Compute Clusters (one with many CPU cores, the other with high-frequency CPUs), each node:
- ThinkSystem SR630 2.5" Chassis with 8 Bays
- 2x Intel Xeon Gold 6244 8C 150W 3.6GHz Processor
- 24x ThinkSystem 64GB TruDDR4 2933MHz (2Rx4 1.2V) RDIMM
- 1x ThinkSystem M.2 with Mirroring Enablement Kit
- 2x ThinkSystem M.2 5100 240GB SATA 6Gbps Non-Hot Swap SSD
- 2x Mellanox ConnectX-4 Lx 10/25GbE SFP28 2-port PCIe Ethernet Adapter
- 2x ThinkSystem 750W (230/115V) Platinum Hot-Swap Power Supply
- 1x ThinkSystem 1Gb 2-port RJ45 LOM
- 1x ThinkSystem XClarity Controller Standard to Enterprise Upgrade
- 1x S2D Cluster, each node:
- ThinkAgile MX Certified Node - All Flash
- 2x Intel Xeon Silver 4210 10C 85W 2.2GHz Processor
- 12x ThinkSystem 32GB TruDDR4 2666 MHz (2Rx4 1.2V) RDIMM
- 3x ThinkSystem 430-8i SAS/SATA 12Gb HBA
- 16x ThinkSystem 2.5" Intel S4510 3.84TB Entry SATA 6Gb Hot Swap SSD
- 4x ThinkSystem U.2 Intel P4610 1.6TB Mainstream NVMe PCIe3.0 x4 Hot Swap SSD
- 1x Mellanox ConnectX-4 Lx 10/25GbE SFP28 2-port PCIe Ethernet Adapter
- 1x ThinkSystem 1Gb 2-port RJ45 LOM
- 2x ThinkSystem 1100W (230V/115V) Platinum Hot-Swap Power Supply
- 1x ThinkSystem M.2 with Mirroring Enablement Kit
- 2x ThinkSystem M.2 5100 480GB SATA 6Gbps Non-Hot Swap SSD
- 1x ThinkSystem XClarity Controller Standard to Enterprise Upgrade
- Networking (a quick port-budget check is sketched after the hardware list)
- 2x Lenovo NE2572, each:
- 48x SFP28/SFP+ ports (25Gbit/s)
- 6x QSFP28/QSFP+ ports (100Gbit/s)
- Service Server
- ThinkSystem SR630 2.5" Chassis with 8 Bays
- 1x Intel Xeon Silver 4208 8C 85W 2.1GHz Processor
- 4x ThinkSystem 32GB TruDDR4 2933MHz (2Rx4 1.2V) RDIMM
- 1x ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter
- 2x ThinkSystem 2.5" 5200 1.92TB Mainstream SATA 6Gb Hot Swap SSD
- 1x ThinkSystem M.2 with Mirroring Enablement Kit
- 2x ThinkSystem M.2 5100 240GB SATA 6Gbps Non-Hot Swap SSD
- 1x ThinkSystem 1Gb 4-port RJ45 LOM
- 2x ThinkSystem 750W (230/115V) Platinum Hot-Swap Power Supply
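As a quick sanity check on the networking above, here is a small sketch of the 25GbE port budget. The even split of each node's ports across both switches is my assumption for redundancy, not part of the BOM:

```python
# 25GbE port budget for the two NE2572 switches, derived from the BOM above.
# Assumption (not stated in the BOM): each node's ports are split evenly
# across both switches for redundancy. The 1Gb RJ45 LOMs are not counted,
# as they do not consume SFP28 ports.

compute_nodes, ports_per_compute_node = 6, 4  # 2x dual-port ConnectX-4 Lx
s2d_nodes, ports_per_s2d_node = 3, 2          # 1x dual-port ConnectX-4 Lx

total_sfp28 = (compute_nodes * ports_per_compute_node
               + s2d_nodes * ports_per_s2d_node)
per_switch = total_sfp28 // 2

print(f"{total_sfp28} SFP28 ports in total, {per_switch} per 48-port switch")
# -> 30 SFP28 ports in total, 15 per 48-port switch: plenty of headroom
```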
- S2D nodes have to be certified in order to receive support from the vendor and Microsoft: https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-hardware-requirements
- The certification program was originally called Windows Server Software-Defined (WSSD); it is now called Azure Stack HCI.
- However, the Windows Server Catalog still refers to it as SDDC certification (https://www.windowsservercatalog.com/), and to make it more confusing, the same systems are also listed here: https://www.microsoft.com/en-us/cloud-platform/azure-stack-hci-catalog
- The certification comes in two levels: SDDC Standard and SDDC Premium.
- Only a small subset of configurations has passed the certification, so vendors usually offer only a few combinations of SSDs/NVMe drives/HDDs and network cards.
- You may want your compute clusters/nodes to pass the SDDC Premium certification as well. While it is not a requirement and the compute nodes will not run S2D, you will want to utilize features such as d.VMMQ (which everyone but Microsoft calls RSSv2), for which proper drivers are a must: https://techcommunity.microsoft.com/t5/networking-blog/synthetic-accelerations-in-a-nutshell-windows-server-2019/ba-p/653976
- The Windows Server Catalog claims that the Lenovo SR630 has passed the SDDC Premium certification...
- Without the XClarity Controller Enterprise upgrade you will not be able to use Lenovo's tool for bare-metal deployment (which might not be an issue, just stating facts :))
- Use the S2D calculator to check sizing: https://aka.ms/s2dcalc (a back-of-the-envelope version of the math is sketched below)
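For reference, here is roughly what the calculator works out for this cluster. The resiliency and reserve assumptions in the comments are my reading of Microsoft's guidance; the calculator remains authoritative:

```python
# Back-of-the-envelope usable capacity for the all-flash S2D cluster above.
# Assumptions: the NVMe P4610 drives act purely as cache and contribute no
# capacity, three-way mirror resiliency, and the equivalent of one capacity
# drive per node left unallocated as rebuild reserve.

nodes = 3
capacity_drives_per_node = 16   # Intel S4510 3.84TB SATA SSDs
drive_size_tb = 3.84
mirror_copies = 3               # three-way mirror keeps 3 copies of each block
reserve_drives_per_node = 1

raw_tb = nodes * capacity_drives_per_node * drive_size_tb
reserve_tb = nodes * reserve_drives_per_node * drive_size_tb
usable_tb = (raw_tb - reserve_tb) / mirror_copies

print(f"raw pool:              {raw_tb:.1f} TB")      # 184.3 TB
print(f"rebuild reserve:       {reserve_tb:.1f} TB")  # 11.5 TB
print(f"usable (3-way mirror): {usable_tb:.1f} TB")   # ~57.6 TB
```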
- Each compute node has 2x 2-port Mellanox network cards, so that I can dedicate two 25Gbit ports to VM traffic and the other two ports to S2D traffic. This greatly simplifies the deployment, since storage and VM traffic do not have to be converged on the same pair of ports.
- Consider the RAM sizing: not only how many GB you need, but also the DIMM slot population.
- While this is something any good vendor should help you with (as well as proper PCIe placement of NVMe cards, Mellanox cards, etc.), just a reminder that for S2D nodes you are looking for balanced memory configurations. In my case, I have to follow: https://lenovopress.com/lp1089-balanced-memory-configurations-xeon-scalable-gen-2
- Why? Because software-defined anything performs its operations in memory, so memory bandwidth matters; a simplified balance check is sketched below.
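As a rough illustration, a minimal sketch of the slot-count rule for this platform. It is a simplification of the rules in the Lenovo paper above, which also covers mixed capacities, ranks, and DIMM types:

```python
# Simplified "balanced memory" check for 2nd-gen Xeon Scalable: each CPU has
# 6 memory channels, so a balanced config populates every channel identically
# (6 or 12 identical DIMMs per socket).

CHANNELS_PER_SOCKET = 6

def is_balanced(total_dimms: int, sockets: int) -> bool:
    per_socket, remainder = divmod(total_dimms, sockets)
    if remainder:
        return False  # both sockets must be populated identically
    # every channel must hold the same number of DIMMs (1 or 2)
    return per_socket in (CHANNELS_PER_SOCKET, 2 * CHANNELS_PER_SOCKET)

print(is_balanced(24, 2))  # compute nodes: 12 DIMMs/socket -> True (balanced)
print(is_balanced(12, 2))  # S2D nodes:      6 DIMMs/socket -> True (balanced)
print(is_balanced(16, 2))  # 8 DIMMs/socket -> False (two channels doubled up)
```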
- Deduplication - will run on the S2D nodes: https://docs.microsoft.com/en-us/windows-server/storage/data-deduplication/install-enable
- A service server is a must - you may utilize existing infrastructure, but you will greatly appreciate a dedicated server for deployment, maintenance, and troubleshooting; somewhere to run a DC, VMM, monitoring, and/or other tools (bare-metal deployment, configuration...)
- You may want to use the "Keep Your Hard Drive" support option for the S2D nodes if you are not running shielded VMs: https://docs.microsoft.com/en-us/windows-server/security/guarded-fabric-shielded-vm/guarded-fabric-and-shielded-vms
- Backup for the VMs - do not forget to consider it as part of the whole solution