Optimizing Storage Solutions for Edge and 5G Cloud Infrastructure

Source: Red Hat Ceph Guide: Public Reference Architecture

Over the last several months there have been significant advancements in CPU cores, especially with the launches targeting Cloud Native and 5G (https://www.intel.com/content/www/us/en/newsroom/news/processors-accelerate-5g-network-transformation.html#gs.2fscqr). It is strongly believed that the scale-out architectures required for servers are now well in place; the missing piece is still storage, and the acceleration needed to make sure customer cloud infrastructures and 5G requirements are addressed.

Why Storage is so difficult

Historically, storage requirements across workloads have varied so widely that customers' only choice has been to plan storage architectures around the worst-case I/O, which is not TCO efficient. Similarly, while the Telco infrastructure industry as a whole agreed on x86 as the reference compute platform, there was never an agreement on storage after the initial OpenStack releases put Ceph at the heart of the architecture. This has led to the following issues:

  1. Many cloud workloads can never be satisfied with SDS. One such example is vDPI, which per field experience needs 3X more storage nodes on SDS compared to a physical SAN.
  2. vSAN architectures are tied to hardware selection; for example, scaling storage alone without expanding CPU cores, as many IT and data-centric applications require, is not possible.
  3. There are many driver and firmware issues between vSAN and the underlying physical hardware, which require a complete compatibility check for optimized utilization. The excellent blog below explains this in more detail:

https://blogs.vmware.com/virtualblocks/2019/03/22/vsan-misconceptions-drivers-and-firmware-dont-matter/

  4. There is at least 30% capacity waste when integrating vSAN-type solutions with a physical SAN.

Early industry adoption of Storage in Telco Infrastructure

From day one, Telcos have wanted to build a software-defined and programmable storage infrastructure, which a SAN is not really meant to offer. Similarly, a SAN requires both FC adapters and FC switches, while an SDS solution can integrate using FC over IP/Ethernet, keeping the whole network on the same TCP/IP suite.

That is why, in NFV 1.0, all our analysis pointed to using an SDS-like solution on Dell R730xd or R740xd class servers with 2.2 TB drives, or using MDS2X00 cabinets or DSS7000 chassis, where the latter is used only for data- and I/O-sensitive workloads such as:

+vUDC

+vDRA

+SPS, etc.

Non-Realtime vs Realtime Performance

SDS solutions like Ceph and BlueData certainly solve the scale-out performance problem by adding OSDs as and when needed, whereas for a SAN the controller is the bottleneck. Field test results show that Ceph, using cloning at the image level, can improve performance by at least 20% in scale-out architectures.
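
To illustrate the image-level cloning referred to above, here is a minimal sketch using Ceph's Python bindings (rados and rbd) to snapshot a golden image and create a copy-on-write clone for a new VNF instance; the pool and image names are illustrative assumptions, not taken from any particular deployment.

import rados
import rbd

# Connect to the Ceph cluster (config path and pool name are illustrative)
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("vms")

rbd_inst = rbd.RBD()

# Snapshot and protect a golden VNF image so it can act as a clone parent
# (cloning requires the layering feature, enabled by default in recent Ceph releases)
with rbd.Image(ioctx, "golden-vnf-image") as image:
    image.create_snap("base")
    image.protect_snap("base")

# Copy-on-write clone: the new image shares unmodified data with the parent,
# which is what makes instantiating many VNF/CNF volumes fast and space efficient
rbd_inst.clone(ioctx, "golden-vnf-image", "base", ioctx, "vnf-instance-01")

ioctx.close()
cluster.shutdown()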

However, the issue lies in realtime performance, which is read-heavy rather than write-heavy, for example when a VNF or CNF starts at boot time. No SDS storage architecture solves this issue in a cost-efficient manner, and that is why, for such special workloads, adoption of SDS will always be a question.

Scale and Replication

The most important advantages in favor of SDS are scale and replication. Three-way copies and VM image replication, such as RBD mirroring to replicate a cluster with zero impact, versus the limited options of a SAN and physical storage, are certainly a win for SDS solutions like vSAN and Ceph.
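
As a hedged illustration of the RBD mirroring mentioned above, the sketch below enables pool-level mirroring with the rbd Python binding; the pool name is an assumption, and peer bootstrapping plus the rbd-mirror daemon on the secondary cluster still have to be set up separately.

import rados
import rbd

# Connect to the primary cluster (conf path and pool name are illustrative)
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("vms")

# Enable pool-mode mirroring: eligible images in the pool are replicated to the
# peer cluster by the rbd-mirror daemon with no impact on the primary workload
rbd.RBD().mirror_mode_set(ioctx, rbd.RBD_MIRROR_MODE_POOL)

ioctx.close()
cluster.shutdown()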

Data Plane limitations

With pass-through architectures like SR-IOV, the following will be limited in SDS:

  • vMotion
  • DRS
  • Data Layer H/A

By limited I mean not the function itself but its performance in a commercial cloud.

Storage SKU optimizations

By far the most difficult issue in cloud infrastructure is finding the most optimized storage solution in a rugged environment. The prevalence of rack servers over blade servers is also due to the fact that SDS cannot map to a blade server as efficiently as it does to a rack server. Similarly, the BOSS is the embedded RAID 1 controller with M.2 storage for the operating system; clearly, the HBA330 will carry a far higher I/O load than the BOSS, but there is no need to add unnecessary I/O load where it can be avoided.

As an architect, it is your job to look out for such caveats and find the optimal architecture for your infrastructure.

Container Storage Solutions

Currently, storage is a critical piece of cloud-native infrastructure, especially its acceleration aspect, which is not well standardized; the main functions are realized through CSI (Container Storage Interface) drivers, such as the vSphere CNS CSI driver discussed below, which are mature and stable.

The Kubernetes vSphere CSI driver is implemented under an architecture called vSphere CNS CSI, which is comprised of two key components: 

  • The CNS in the vCenter Server 
  • The vSphere volume driver in a Kubernetes cluster

In a Kubernetes cluster, CNS provides a volume driver that has two subcomponents—the CSI driver and the syncer. The CSI driver is responsible for volume provisioning; attaching and detaching the volume to VMs; mounting, formatting, and unmounting volumes from the pod within the node VM; and so on. The CSI driver is built as an out-of-tree CSI plugin for Kubernetes. The syncer is responsible for pushing PV, PVC, and pod metadata to CNS

Source: VMware blog

A Deep Dive into the Kubernetes vSphere CSI Driver with TKGI and TKG (vmware.com)
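
To make the provisioning flow described in the quoted passage concrete, here is a minimal sketch (my own illustration, not from the VMware post) that requests a volume from the vSphere CSI driver by creating a PVC with the official Kubernetes Python client. The StorageClass name "vsphere-csi-sc" is an assumption; it must map to the csi.vsphere.vmware.com provisioner in your cluster.

from kubernetes import client, config

# Load kubeconfig for the target cluster (assumes kubectl access is already configured)
config.load_kube_config()

# PVC manifest: claim 10Gi from the assumed vSphere CSI-backed StorageClass
pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "cnf-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "vsphere-csi-sc",  # assumed StorageClass name
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

# The CSI driver provisions a CNS volume, then attaches and mounts it when a pod uses the claim
client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc_manifest
)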

vSphere CSI driver-supported capabilities 

For the most up-to-date list of capabilities supported by the vSphere CSI driver, refer to the VMware documentation linked above.



Saad Sheikh

I am a Senior Architect with a passion for architecting and delivering solutions that address business adoption of cloud and automation/orchestration, covering both the Telco and IT applications industries.

My work in carrier digital transformation involves architecting and deploying platforms for both Telco and IT applications, including clouds (both OpenStack and container platforms), carrier-grade NFV, SDN and infrastructure networking, DevOps CI/CD, orchestration (both NFVO and E2E SO), and Edge and 5G platforms for both consumer and enterprise business. On the DevOps side, I am deeply interested in TaaS platforms and the journey toward unified clouds, including transition strategies for successful migration to the cloud.

Please write to me at snasrullah@swedtel.com
