The use of containers and Kubernetes in the industry has made decent progress since they first appeared back in 2013-2014. However, there is still a lot of uncertainty around their use in a production environment. Recently OpenStack organized an OpenDev workshop (https://www.openstack.org/events/opendev-2020/) where many domain experts from the industry, including Telcos, vendors, system integrators and enterprises, spent time clarifying many misconceptions and sharing experiences about how to use containers in a large-scale Enterprise or Telco environment.
The purpose of this paper is to share some key insights from those discussions.
How to use containers
Different industry use cases need to support different deployment scenarios. For example, some common views are as follows:
- Telco architects think containers must be deployed on top of existing clouds, mainly OpenStack or VMware VCF (through Project Pacific)
- Enterprise folks believe containers should run with or without Kubernetes. Their main requirement is broad support on bare metal
- Application and IT teams think everything should run on Kubernetes (a.k.a. K8s). Developers share this view
Build and Test Images
There are different approaches to building images, and the best one is to give developers as much flexibility as possible in choosing the base images to build from. However, some industry recommendations are as follows:
- Start from carrier-grade images like CentOS. Although it is a somewhat fat image, it saves time in troubleshooting and enhancement, which is a definite value
- The second-best approach is to extract images using mirror tags, as with CoreDNS. This is a favorable direction from the IT/developer viewpoint
- Another approach is to use simple images that still have complete support for build utilities, e.g. Debian Selenium
- Using minimal base images like Alpine is also an option, depending on the use case
Once the images are built, the most important step is to test and validate them. For this our suggestions are to:
- Start from base images (so that a minimum set of certification cases is already tested)
- First check everything, including the deployment approach
- Run tests in an isolated environment first, followed by multi-stage CI to separate test from production
- Use utilities like buildx that can support both x86-64 (AMD64) and ARM architectures
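As a minimal sketch of the multi-architecture build step, the helper below only assembles a `docker buildx build` command line. It targets `linux/amd64` and `linux/arm64`, which is an assumption about the intended pair of architectures, and the image name is hypothetical. The command is printed rather than executed, so a CI job could review it before running it.

```python
import shlex


def buildx_command(image, platforms=("linux/amd64", "linux/arm64"),
                   context=".", push=False):
    """Return the argv list for a multi-platform `docker buildx build`."""
    if not platforms:
        raise ValueError("at least one target platform is required")
    cmd = [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # comma-separated platform list
        "--tag", image,
    ]
    if push:
        # Multi-platform results are typically pushed to a registry
        # rather than loaded into the local image store.
        cmd.append("--push")
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Hypothetical image name, used only for illustration.
    print(shlex.join(buildx_command("registry.example.com/demo/app:1.0", push=True)))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later without shell quoting issues.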
Which registries to use in the cloud again depends on the use case and the industry. For example, Telco customers want something that works well with OpenStack, so the Zuul registry is common, followed by Docker and Harbor. Zuul is especially convenient because Zuul jobs can tag and push images to Docker Hub, and image scanning with Clair is widely supported.
Docker is still believed to be the native and most widely supported runtime environment, especially in its Enterprise offering from Mirantis. Podman from Red Hat is gaining popularity; however, there are still a number of behavior issues in Podman, especially with bind mounts, that need to be standardized before such a move. OCI and CRI-O are gaining wider community support and I believe that by Kubernetes 1.19 they may surpass Podman.
For the Telecom industry, tenant isolation and security requirements make the use of Kata Containers important. For some workloads such as vIMS and vMME, using one architecture over another is not a software matter but a regulatory one.
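One way such isolation is commonly expressed in Kubernetes is a RuntimeClass that pods opt into. The sketch below builds both manifests as plain Python dictionaries and prints them as JSON. The handler name `kata`, the pod name and the image are assumptions, since the handler must match whatever the node's container runtime is configured with, and `node.k8s.io/v1` assumes a reasonably recent cluster (older clusters used a beta version of this API).

```python
import json


def kata_runtime_class(name="kata", handler="kata"):
    """RuntimeClass manifest; `handler` is an assumed runtime handler name."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def isolated_pod(pod_name, image, runtime_class="kata"):
    """Pod manifest that asks to run under the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical workload name and image, for illustration only.
    manifests = [kata_runtime_class(),
                 isolated_pod("vims-demo", "registry.example.com/vims:1.0")]
    print(json.dumps(manifests, indent=2))
```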
Deploy Containers in OpenStack
When it comes to deploying containers on OpenStack there can be many approaches, ranging from using Magnum to build a Kubernetes controller to something as simple as a kernel configuration file, using a set of utilities like Spyros that ensure complete LCM and fast deployment of containers on VMs.
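For the Magnum route, a cluster is usually requested through the OpenStack client. The following is a minimal sketch that only assembles an `openstack coe cluster create` command. The cluster name, template name and keypair are hypothetical, and it assumes the Magnum client plugin is installed and a cluster template already exists.

```python
import shlex


def magnum_cluster_create(name, template, node_count=2, master_count=1,
                          keypair=None):
    """Return the argv list for creating a Magnum cluster."""
    if node_count < 1 or master_count < 1:
        raise ValueError("node_count and master_count must be at least 1")
    cmd = [
        "openstack", "coe", "cluster", "create", name,
        "--cluster-template", template,
        "--master-count", str(master_count),
        "--node-count", str(node_count),
    ]
    if keypair:
        # SSH keypair to inject into the cluster nodes.
        cmd += ["--keypair", keypair]
    return cmd


if __name__ == "__main__":
    # Hypothetical names; the command is printed, not executed.
    print(shlex.join(magnum_cluster_create("telco-k8s", "k8s-template",
                                           node_count=3, keypair="ops-key")))
```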
Similarly, containers can use storage from OpenStack in a number of ways, including:
- Cinder API
- Manila using NFS or CephFS
- OpenEBS
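To illustrate the Cinder option, the sketch below builds a StorageClass backed by the Cinder CSI driver and a PersistentVolumeClaim that uses it. It assumes the Cinder CSI plugin from the OpenStack cloud provider is deployed in the cluster; the class name, claim name and size are hypothetical.

```python
import json


def cinder_storage_class(name="cinder-standard"):
    """StorageClass that provisions volumes through the Cinder CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "cinder.csi.openstack.org",
    }


def volume_claim(name, storage_class, size_gi, access_mode="ReadWriteOnce"):
    """PersistentVolumeClaim requesting `size_gi` GiB from a StorageClass."""
    if size_gi <= 0:
        raise ValueError("size_gi must be positive")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


if __name__ == "__main__":
    sc = cinder_storage_class()
    pvc = volume_claim("app-data", sc["metadata"]["name"], 20)
    print(json.dumps([sc, pvc], indent=2))
```

Block volumes from Cinder are normally mounted by a single node (`ReadWriteOnce`), whereas a shared file system served through Manila is the usual choice when several pods need `ReadWriteMany` access.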
As in OpenStack, ephemeral storage has disadvantages, for example you cannot know how the provider implements it. That is why an implementation using Ceph3.0/Rook looks like the best direction in a hybrid cloud environment.
Using the Containers
Networking and exposing containers to the outside is still a debatable topic and will be the subject of a separate write-up, primarily because many workloads are still stateful and the NIC is not a floating instance for many workloads, especially in Telecom. Having said this, there are still some suggestions for accessing containers in a standard way, such as:
- Use of floating IPs, as in Calico and Flannel
- Custom CRDs
Again, if we deploy these solutions on OpenStack we may need to use a solution like Kuryr to avoid double encapsulation, or disable port security and supplement it using kube-router or Calico.
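When port security is disabled at the OpenStack layer, traffic filtering typically moves into Kubernetes NetworkPolicy objects, which both Calico and kube-router can enforce. The following is a minimal sketch of such a policy built as a Python dictionary; the namespace, labels and port are hypothetical.

```python
import json


def allow_ingress_from(namespace, app_label, client_label, port):
    """NetworkPolicy allowing TCP `port` on `app_label` pods from `client_label` pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-{client_label}-to-{app_label}",
            "namespace": namespace,
        },
        "spec": {
            # Pods this policy protects.
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                # Only pods carrying the client label may connect.
                "from": [{"podSelector": {"matchLabels": {"app": client_label}}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(allow_ingress_from("telco", "upf", "smf", 8805), indent=2))
```

Once a pod is selected by a policy with the `Ingress` type, any ingress traffic not explicitly allowed is dropped, which is what replaces the per-port filtering that port security provided.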
Cloud Provider SIG
If you are a Telecom provider who has already built a Telco Cloud in recent years, then this will be really important for you, as the Cloud Provider SIG supports a way to integrate Kubernetes (K8s) with OpenStack using cluster management tools like
- Cluster API
HPC and Scientific SIG
HPC use cases are becoming extremely important for Telcos, primarily due to the new technology wave and use cases around Cloud and 5G.
NVIDIA T-Series GPUs are especially popular for running ML/AI workloads in Telecom. They can deliver high performance on VMs with efficient resource utilization ratios such as 1:4, and 1:8 for containers, by exposing GPUs to the VMs running Kubernetes. In addition, special use cases such as GIS and image profiling can use pass-through, like the well-known SR-IOV use cases of Telecom 5G CNFs such as the UPF.
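On the Kubernetes side, a container usually asks for a GPU through the `nvidia.com/gpu` extended resource. The sketch below builds such a pod manifest; it assumes the NVIDIA device plugin is installed on the nodes (or VMs) that expose GPUs, and the pod name and image are hypothetical.

```python
import json


def gpu_pod(name, image, gpus=1):
    """Pod manifest requesting whole GPUs via the NVIDIA device plugin resource."""
    if gpus < 1:
        raise ValueError("gpus must be a positive integer")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources are requested as whole units in limits.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical training job image, for illustration only.
    print(json.dumps(gpu_pod("ml-train", "registry.example.com/ml/train:1.0"),
                     indent=2))
```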
In a nutshell, containers are ready for production. However, just like other cloud solutions, there is no one picture that fits all screens, so a careful selection of components and solutions is required to get the maximum advantage from the Cloud. To ensure that we as a community and industry do not miss the boat, as we somewhat did in the OpenStack VM journey, it is very important to define and standardize both the consumption models and the deployment scenarios that can support a real carrier-grade evolution to containers. The Cloud iNfrastructure Telco Taskforce (CNTT) has recently launched a new initiative to help bring focus on cloud-native network functions (CNFs) and Kubernetes-based platforms. A working group within Reference Architecture 2 (RA-2, Kubernetes based) has kicked off a short survey to collect data on Kubernetes adoption in telecom. The link is below, and I hope you will take an active part and share your insights to help uplift the infrastructure to the Cloud Native era.
I am a Senior Architect with a passion for architecting and delivering solutions that address business adoption of the Cloud and Automation/Orchestration, covering both the Telco and IT applications industries.
My work in carrier digital transformation involves architecting and deploying platforms for both Telco and IT applications, including OpenStack and container clouds, carrier-grade NFV, SDN and infrastructure networking, DevOps CI/CD, orchestration (both NFVO and E2E SO), and Edge and 5G platforms for both consumer and enterprise business. On the DevOps side I am deeply interested in TaaS platforms and the journey towards unified clouds, including transition strategies for successful migration to the Cloud.
Please write to me at email@example.com