Advanced Container Networking Services is now generally available, helping you strengthen the operational and security capabilities of your Azure Kubernetes Service (AKS) clusters.
With the growing popularity of cloud-native technologies, containers and Kubernetes have become the foundation of modern application deployments. Microservices-based container workloads are more portable, more resource-efficient, and easier to scale. By using Kubernetes to manage these workloads, organizations can deploy advanced AI and machine learning applications across diverse compute resources and significantly improve operational productivity at scale. As application design evolves, deep observability and built-in granular security controls are in high demand, but the ephemeral nature of containers makes both difficult to achieve. That is where Azure Advanced Container Networking Services comes in.
Advanced Container Networking Services for Azure Kubernetes Service (AKS), a cloud-native solution purpose-built to improve security and observability for Kubernetes and containerized environments, is now generally available. Its primary goal is to deliver a seamless, integrated experience that lets you maintain a strong security posture and gain comprehensive insight into your network traffic and application performance. This helps ensure that your containerized applications are not only secure but also meet your performance and reliability goals, so you can manage and scale your infrastructure with confidence.
Let’s examine this release’s observability and container network security features.
Container Network Observability
Kubernetes excels at orchestrating and managing diverse workloads, but one significant challenge remains: how do you gain meaningful insight into the interactions between these services? Ensuring reliability and security requires monitoring the network traffic of microservices, tracking performance, and understanding the dependencies between components. Without this level of insight, performance problems, outages, and even potential security threats can go undetected.
To fully evaluate how well your microservices are performing, you need more than basic cluster-level data and virtual network logs. Thorough network observability requires granular network metrics, including node-level, pod-level, and Domain Name System (DNS)-level insights. Teams can use these metrics to monitor the health of every cluster service, troubleshoot problems, and identify bottlenecks.
To address these challenges, Advanced Container Networking Services offers powerful observability features designed specifically for Kubernetes and containerized environments. It provides real-time, comprehensive insights spanning node-level, pod-level, Transmission Control Protocol (TCP), and DNS-level metrics, so that no part of your network goes unmonitored. These metrics are essential for pinpointing performance bottlenecks and resolving network problems before they affect workloads.
The network observability features of Advanced Container Networking Services include:
- Node-level metrics: These metrics report traffic volume, connection counts, dropped packets, and similar data for each node. The metrics are stored in Prometheus format and can be viewed in Grafana.
- Hubble metrics (DNS and pod-level metrics): Advanced Container Networking Services uses Hubble to collect metrics and enriches them with Kubernetes context, such as source and destination pod names and namespace information, which makes it possible to pinpoint network-related problems more precisely. The metrics cover traffic volume, dropped packets, TCP resets, L4/L7 packet flows, and more. DNS metrics covering DNS errors and DNS requests that received no response are also included.
- Hubble flow logs: Flow logs provide visibility into workload communication, making it easier to understand how microservices talk to each other. They also help answer questions such as: Did the server receive the client’s request? How long did the server take to respond to it?
- Service dependency map: Hubble UI offers another way to visualize this traffic. It displays flow logs for the selected namespace and builds a service-connection graph from them.
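To make the idea of a service-connection graph built from flow logs more concrete, here is a minimal Python sketch. It assumes a simplified, hypothetical flow record (source and destination namespace and pod, plus a verdict); this is not Hubble's actual flow schema, and the `Flow` class and `build_service_graph` function are illustrative names only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (not Hubble's real schema)."""
    src_ns: str
    src_pod: str
    dst_ns: str
    dst_pod: str
    verdict: str  # e.g. "FORWARDED" or "DROPPED"


def build_service_graph(flows, namespace):
    """Aggregate flows that touch `namespace` into edges with per-verdict counts."""
    edges = defaultdict(dict)
    for f in flows:
        # Keep only flows where the chosen namespace is the source or destination.
        if namespace not in (f.src_ns, f.dst_ns):
            continue
        key = (f"{f.src_ns}/{f.src_pod}", f"{f.dst_ns}/{f.dst_pod}")
        edges[key][f.verdict] = edges[key].get(f.verdict, 0) + 1
    return dict(edges)


# Made-up sample data for illustration.
flows = [
    Flow("shop", "frontend", "shop", "cart", "FORWARDED"),
    Flow("shop", "frontend", "shop", "cart", "FORWARDED"),
    Flow("shop", "cart", "shop", "redis", "DROPPED"),
    Flow("other", "batch", "other", "db", "FORWARDED"),
]

graph = build_service_graph(flows, "shop")
for (src, dst), counts in sorted(graph.items()):
    print(f"{src} -> {dst}: {counts}")
```

Each edge in the resulting dictionary is one arrow in the dependency map, and an edge with a non-zero dropped count is a natural place to start troubleshooting.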
Container Network Security
One of the main challenges in container security is that Kubernetes, by default, allows all communication between endpoints, which poses significant security risks. Advanced Container Networking Services with Azure CNI powered by Cilium enables advanced, fine-grained network policies based on Kubernetes identities, so that only authorized traffic is allowed and endpoints are secured.
Traditional network policies use IP-based rules to control external traffic, yet external services frequently change their IP addresses. This makes it difficult to guarantee and enforce consistent security for workloads that communicate outside the cluster. With the fully qualified domain name (FQDN) filtering and security agent DNS proxy in Advanced Container Networking Services, network policies are insulated from IP address changes.
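As an illustration of what an FQDN-based rule can look like, the Python sketch below builds a CiliumNetworkPolicy manifest as a dictionary and prints it as JSON. The field layout (`toFQDNs`, `matchName`, and the DNS rule on port 53 toward `kube-dns`) follows the Cilium policy schema as we understand it, so check it against the Cilium and AKS documentation for your version. The policy name, pod label, and domain are hypothetical placeholders.

```python
import json


def fqdn_egress_policy(name, pod_labels, fqdn):
    """Return a CiliumNetworkPolicy manifest (as a dict) allowing egress to one FQDN."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # Pods the policy applies to.
            "endpointSelector": {"matchLabels": pod_labels},
            "egress": [
                {
                    # Allow DNS lookups for the FQDN so it can be resolved.
                    "toEndpoints": [{"matchLabels": {
                        "k8s:io.kubernetes.pod.namespace": "kube-system",
                        "k8s:k8s-app": "kube-dns",
                    }}],
                    "toPorts": [{
                        "ports": [{"port": "53", "protocol": "ANY"}],
                        "rules": {"dns": [{"matchName": fqdn}]},
                    }],
                },
                # Allow traffic to whatever IPs the FQDN currently resolves to.
                {"toFQDNs": [{"matchName": fqdn}]},
            ],
        },
    }


# Hypothetical name, label, and domain, used only for illustration.
policy = fqdn_egress_policy("allow-example-api", {"app": "checkout"}, "api.example.com")
print(json.dumps(policy, indent=2))
```

Note that the rule names a domain rather than an address, so nothing in the policy has to change when the external service moves to a new IP.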
FQDN filtering and security agent DNS proxy
The solution has two primary components: the Cilium Agent and the security agent DNS proxy. Together, they integrate FQDN filtering into Kubernetes clusters, making external communication easier to manage and control.
Cilium Agent
The Cilium Agent is a critical networking component that runs as a DaemonSet in clusters that use Azure CNI powered by Cilium. The agent handles networking, load balancing, and network policies for pods in the cluster. For pods with FQDN policies enforced, the Cilium Agent redirects DNS packets to the DNS proxy for name resolution and updates the network policy using the FQDN:IP mappings obtained from the DNS proxy.
Security Agent DNS Proxy
When Advanced Container Networking Services is enabled, the DNS proxy included in the security agent runs as a DaemonSet in clusters that use Azure CNI powered by Cilium. It handles DNS resolution for pods and, on successful resolution, updates the Cilium Agent with FQDN-to-IP mappings.
Because the security agent DNS proxy runs in a separate DaemonSet (acns-security-agent) alongside the Cilium Agent, pods continue to resolve DNS names even when the Cilium Agent is down or being upgraded. Kubernetes’ maxSurge upgrade feature keeps the DNS proxy operational during upgrades. This architecture ensures that DNS resolution problems do not disrupt network connectivity for critical customer workloads.
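The interplay between the two components can be sketched with a toy model in Python. This is only an illustration of the idea described above, not how Cilium or the security agent is implemented: the `ToyAgent` and `ToyDnsProxy` classes, the exact-name matching, and the in-memory DNS records are all assumptions made for the example.

```python
class ToyAgent:
    """Toy stand-in for the agent: tracks which IPs the FQDN policy allows."""

    def __init__(self, allowed_fqdns):
        self.allowed_fqdns = set(allowed_fqdns)
        self.fqdn_to_ips = {}

    def update_mapping(self, fqdn, ips):
        # Only names that appear in the policy open up egress.
        if fqdn in self.allowed_fqdns:
            # Simplification: replace old IPs outright. A real implementation
            # would more likely expire entries based on DNS TTLs.
            self.fqdn_to_ips[fqdn] = set(ips)

    def allow_egress(self, dst_ip):
        return any(dst_ip in ips for ips in self.fqdn_to_ips.values())


class ToyDnsProxy:
    """Toy stand-in for the DNS proxy: resolves names and reports mappings."""

    def __init__(self, agent, resolver):
        self.agent = agent
        self.resolver = resolver  # callable: fqdn -> list of IP strings

    def resolve(self, fqdn):
        ips = self.resolver(fqdn)
        if ips:  # report only successful resolutions
            self.agent.update_mapping(fqdn, ips)
        return ips


# Made-up DNS records using documentation-reserved IP ranges.
records = {
    "api.example.com": ["203.0.113.10"],
    "other.example.net": ["198.51.100.7"],
}
agent = ToyAgent({"api.example.com"})
proxy = ToyDnsProxy(agent, lambda name: records.get(name, []))

proxy.resolve("api.example.com")
allowed_before = agent.allow_egress("203.0.113.10")

# The external service moves to a new IP; the policy itself is unchanged.
records["api.example.com"] = ["203.0.113.20"]
proxy.resolve("api.example.com")
allowed_after = agent.allow_egress("203.0.113.20")

proxy.resolve("other.example.net")  # resolves, but is not in the policy
allowed_other = agent.allow_egress("198.51.100.7")

print(allowed_before, allowed_after, allowed_other)
```

The point of the sketch is that the allowed IP set follows DNS resolution, so an address change on the external side is picked up on the next lookup without anyone editing the policy.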
Customer adoption and scenarios
Even during its preview, Advanced Container Networking Services was adopted by many internal and external customers for the following use cases:
- Using DNS error data and analytics to troubleshoot DNS resolution timeouts and application degradation.
- Troubleshooting pods and applications that intermittently lose connectivity to other pods or to external endpoints. Pod metrics, including dropped packet counts, TCP failures, and retransmissions, help cluster administrators resolve these connectivity issues more quickly.
- Using flow logs to troubleshoot network connectivity problems.
- Defining Cilium network policies with FQDNs rather than IP addresses, which significantly simplifies policy management, strengthens cluster security, and keeps policies resilient to IP address changes.
In conclusion
It is crucial that you incorporate security and observability into each tier of your infrastructure as you proceed with your cloud-native journey. You can move more quickly and innovate more when you have the necessary tools in place and know that your workloads are secure and visible.