
k8s series: the flannel network plugin

k8s flannel directrouting

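Cross-node communication (without a pod network) has to go through NAT, that is, source address translation.

Kubernetes network communication:

    1) Container to container: containers in the same pod communicate over lo.

    2) Pod to pod: pod IP <---> pod IP; pods must be able to reach each other without any address translation.

    3) Pod to Service: pod IP <---> cluster IP (the Service IP) <---> pod IP, implemented with iptables or ipvs. Note that ipvs cannot fully replace iptables: ipvs handles the load balancing, while kube-proxy still relies on iptables for tasks such as SNAT/masquerading and packet filtering.

    4) Service to clients outside the cluster.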

 

    [root@master pki]# kubectl get configmap -n kube-system
    NAME                                 DATA   AGE
    coredns                              1      22d
    extension-apiserver-authentication   6      22d
    kube-flannel-cfg                     2      22d
    kube-proxy                           2      22d
    kubeadm-config                       1      22d
    kubelet-config-1.11                  1      22d
    kubernetes-dashboard-settings        1      9h

  

    [root@master pki]# kubectl get configmap kube-proxy -o yaml -n kube-system
    mode: ""

  

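The mode field is empty, which means kube-proxy uses its default (iptables). Changing it to ipvs is all that is needed to switch modes.

Kubernetes relies on the CNI interface to plug in third-party plugins for networking. Popular plugins include flannel, Calico, Canal, and kube-router.

These plugins are built on one of the following approaches:

    1) Virtual bridge with virtual NICs: containers communicate through a shared virtual bridge;

    2) Multiplexing: MacVLAN, where multiple containers share one physical NIC;

    3) Hardware switching: SR-IOV, where one physical NIC exposes multiple virtual interfaces; this has the best performance.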

Where the CNI plugin configuration is stored:

    [root@master ~]# cat /etc/cni/net.d/10-flannel.conflist
    {
      "name": "cbr0",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }

  

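flannel only provides network connectivity; it does not support network policy.

Calico supports both connectivity and network policy.

Canal: flannel and Calico combined.

We can deploy flannel for connectivity and add Calico solely for network policy, instead of using Canal.

MTU: the largest packet size that can pass through a given layer of a communication protocol.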

    [root@master ~]# ifconfig
    cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
            inet 10.244.0.1 netmask 255.255.255.0 broadcast 0.0.0.0
            inet6 fe80::4097:d5ff:fe28:6b64 prefixlen 64 scopeid 0x20<link>
            ether 0a:58:0a:f4:00:01 txqueuelen 1000 (Ethernet)
            RX packets 1609844 bytes 116093191 (110.7 MiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 1632952 bytes 577989701 (551.2 MiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
            inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
            ether 02:42:83:f8:b8:ff txqueuelen 0 (Ethernet)
            RX packets 0 bytes 0 (0.0 B)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 0 bytes 0 (0.0 B)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
            inet 172.16.1.100 netmask 255.255.255.0 broadcast 172.16.1.255
            inet6 fe80::9cf3:d9de:59f:c320 prefixlen 64 scopeid 0x20<link>
            inet6 fe80::5707:6115:267b:bff5 prefixlen 64 scopeid 0x20<link>
            inet6 fe80::e34:f952:2859:4c69 prefixlen 64 scopeid 0x20<link>
            ether 00:50:56:a2:4e:cb txqueuelen 1000 (Ethernet)
            RX packets 5250378 bytes 704067861 (671.4 MiB)
            RX errors 139 dropped 190 overruns 0 frame 0
            TX packets 4988169 bytes 4151179300 (3.8 GiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
            inet 10.244.0.0 netmask 255.255.255.255 broadcast 0.0.0.0
            inet6 fe80::a82c:bcff:fef8:895c prefixlen 64 scopeid 0x20<link>
            ether aa:2c:bc:f8:89:5c txqueuelen 0 (Ethernet)
            RX packets 51 bytes 3491 (3.4 KiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 53 bytes 5378 (5.2 KiB)
            TX errors 0 dropped 10 overruns 0 carrier 0 collisions 0
    lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
            inet 127.0.0.1 netmask 255.0.0.0
            inet6 ::1 prefixlen 128 scopeid 0x10<host>
            loop txqueuelen 1 (Local Loopback)
            RX packets 59118846 bytes 15473986573 (14.4 GiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 59118846 bytes 15473986573 (14.4 GiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    veth6ec94aab: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
            inet6 fe80::487d:5bff:fef7:484d prefixlen 64 scopeid 0x20<link>
            ether 4a:7d:5b:f7:48:4d txqueuelen 0 (Ethernet)
            RX packets 88112 bytes 19831802 (18.9 MiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 105718 bytes 13343894 (12.7 MiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    vethf703483a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
            inet6 fe80::b06a:eaff:fec3:33a8 prefixlen 64 scopeid 0x20<link>
            ether b2:6a:ea:c3:33:a8 txqueuelen 0 (Ethernet)
            RX packets 760882 bytes 59400960 (56.6 MiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 763263 bytes 282299805 (269.2 MiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    vethff579703: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
            inet6 fe80::d82f:37ff:fe9a:b6d0 prefixlen 64 scopeid 0x20<link>
            ether da:2f:37:9a:b6:d0 txqueuelen 0 (Ethernet)
            RX packets 760850 bytes 59398245 (56.6 MiB)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 764016 bytes 282349248 (269.2 MiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

  

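The ifconfig output shows that flannel.1 has the address 10.244.0.0, a netmask of 255.255.255.255, and an MTU of 1450. The MTU is lower than the physical NIC's 1500 because part of each packet is reserved for the overhead of the encapsulation headers.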
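The 1450 figure follows from the size of the VXLAN encapsulation headers. A minimal sketch of the arithmetic, assuming an IPv4 underlay with a 1500-byte MTU and no VLAN tag (the constant and function names are illustrative, not taken from flannel):

```python
# VXLAN encapsulation overhead on an IPv4 underlay, in bytes.
INNER_ETHERNET = 14  # Ethernet header of the pod's frame, carried inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def vxlan_mtu(underlay_mtu: int) -> int:
    """Largest inner IP packet that still fits in one underlay packet."""
    overhead = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
    return underlay_mtu - overhead


print(vxlan_mtu(1500))  # 1450, the MTU shown on flannel.1 and cni0
```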

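cni0 only appears once pods are running on the node.

Pods on two different nodes can communicate through the flannel tunnel. The default backend is VXLAN; because of its encapsulation overhead, performance is somewhat lower.

flannel's second backend is host-gw (host gateway): each node uses its own network interface as the gateway for its pods, so that pods on different nodes can communicate. It is faster than VXLAN because there is no encapsulation overhead. Its drawback is that all nodes must be on the same network segment.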

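In addition, if the nodes hosting two pods are on the same network segment, VXLAN can behave like host-gw: traffic is routed directly through the physical NIC instead of being encapsulated in the flannel tunnel, which improves VXLAN performance. This flannel feature is called DirectRouting.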
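The choice DirectRouting makes can be sketched as a subnet membership check. This is a simplification of the idea, not flannel's actual code; the function name and the 192.168.5.20 address are hypothetical, while the 172.16.1.x addresses are the node IPs used in this article.

```python
import ipaddress


def pick_path(local_if: str, remote_ip: str) -> str:
    """Return 'direct' when the remote node sits on the local node's subnet,
    otherwise 'vxlan' (fall back to the tunnel)."""
    local = ipaddress.ip_interface(local_if)
    remote = ipaddress.ip_address(remote_ip)
    return "direct" if remote in local.network else "vxlan"


print(pick_path("172.16.1.101/24", "172.16.1.102"))  # direct
print(pick_path("172.16.1.101/24", "192.168.5.20"))  # vxlan
```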

    [root@master ~]# kubectl get daemonset -n kube-system
    NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
    kube-flannel-ds-amd64   3         3         3       3            3           beta.kubernetes.io/arch=amd64

  

    [root@master ~]# kubectl get pods -n kube-system -o wide
    NAME                          READY   STATUS    RESTARTS   AGE   IP             NODE
    kube-flannel-ds-amd64-6zqzr   1/1     Running   8          22d   172.16.1.100   master
    kube-flannel-ds-amd64-7qtcl   1/1     Running   7          22d   172.16.1.101   node1
    kube-flannel-ds-amd64-kpctn   1/1     Running   6          22d   172.16.1.102   node2

  

flannel runs here as pods managed by a DaemonSet controller (it can also run directly on the host as a system daemon).

 

    [root@master ~]# kubectl get configmap -n kube-system
    NAME               DATA   AGE
    kube-flannel-cfg   2      22d

  

    [root@master ~]# kubectl get configmap kube-flannel-cfg -o json -n kube-system
    \\\"10.244.0.0/16\\\",\\n \\\"Backend\\\": {\\n \\\"Type\\\": \\\"vxlan\

  

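flannel configuration parameters:

    1. Network: the network address, in CIDR notation, that flannel uses to provide pod networking.

        1) 10.244.0.0/16 --->

                master: 10.244.0.0/24

                node01: 10.244.1.0/24

                ....

                node255: 10.244.255.0/24

            Supports 256 nodes.

        2) 10.0.0.0/8

                10.0.0.0/24

                ...

                10.255.255.0/24

            Supports more than 60,000 nodes (65,536).

    2. SubnetLen: the prefix length used when splitting Network into per-node subnets; the default is 24.

    3. SubnetMin: the lowest subnet that may be handed out to a node. For example, setting it to 10.244.10.0/24 keeps subnets 0 through 9 out of use.

    4. SubnetMax: the highest subnet that may be handed out to a node, for example 10.244.100.0/24.

    5. Backend: vxlan, host-gw, udp (the slowest).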
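The node counts above can be checked with Python's standard ipaddress module. A minimal sketch, where node_subnets is a hypothetical helper that only mirrors the Network and SubnetLen parameters; flannel's real allocator also applies SubnetMin and SubnetMax, which are ignored here:

```python
import ipaddress


def node_subnets(network: str, subnet_len: int = 24):
    """Split the pod network into per-node subnets of the given prefix length."""
    return list(ipaddress.ip_network(network).subnets(new_prefix=subnet_len))


small = node_subnets("10.244.0.0/16")
print(len(small), small[0], small[1], small[-1])
# 256 10.244.0.0/24 10.244.1.0/24 10.244.255.0/24

print(len(node_subnets("10.0.0.0/8")))  # 65536
```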

    

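flannel supports several backends:

    vxlan

        1. vxlan

        2. DirectRouting

    host-gw: Host Gateway  # not recommended: it only works within a single layer-2 network and cannot cross networks; with thousands of pods it is prone to broadcast storms

    UDP: poor performance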

 

    [root@master ~]# kubectl get pods -o wide
    NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
    myapp-deploy-69b47bc96d-79fqh   1/1     Running   4          7d    10.244.1.97   node1
    myapp-deploy-69b47bc96d-tc54k   1/1     Running   4          7d    10.244.2.88   node2

  

    [root@master ~]# kubectl exec -it myapp-deploy-69b47bc96d-79fqh -- /bin/sh
    / # ping 10.244.2.88    # ping the IP of the container on the other node
    PING 10.244.2.88 (10.244.2.88): 56 data bytes
    64 bytes from 10.244.2.88: seq=0 ttl=62 time=0.459 ms
    64 bytes from 10.244.2.88: seq=0 ttl=62 time=0.377 ms
    64 bytes from 10.244.2.88: seq=1 ttl=62 time=0.252 ms
    64 bytes from 10.244.2.88: seq=2 ttl=62 time=0.261 ms

  

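Capturing on another node (here, master) with an icmp filter on ens192 shows no packets, so no bare ICMP traffic crosses ens192 there. As the later capture on node1 shows, the pings cross the physical NIC only in encapsulated form, which an icmp filter does not match.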

[root@master ~]# tcpdump -i ens192 -nn icmp

  

[root@master ~]# yum install bridge-utils -y

  

    [root@master ~]# brctl show docker0
    bridge name     bridge id               STP enabled     interfaces
    docker0         8000.024283f8b8ff       no

  

    [root@master ~]# brctl show cni0
    bridge name     bridge id               STP enabled     interfaces
    cni0            8000.0a580af40001       no              veth6ec94aab
                                                            vethf703483a
                                                            vethff579703

  

All of the veth interfaces are attached to the cni0 bridge.

brctl show lists the existing bridges.

    [root@node1 ~]# tcpdump -i cni0 -nn icmp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
    23:40:11.370754 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 96, length 64
    23:40:11.370988 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 96, length 64
    23:40:12.370888 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 97, length 64
    23:40:12.371090 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 97, length 64
    23:40:13.371015 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 98, length 64
    23:40:13.371239 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 98, length 64
    23:40:14.371128 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 99, length 64

  

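On the node, the ping packets from inside the container can be captured on the cni0 interface.

The ping traffic actually enters through cni0, leaves through flannel.1, and is finally sent out by the physical NIC ens192. So the packets can be captured on flannel.1 as well: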

    [root@node1 ~]# tcpdump -i flannel.1 -nn icmp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
    03:12:36.823315 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 12840, length 64
    03:12:36.823496 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 12840, length 64
    03:12:37.823490 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 12841, length 64
    03:12:37.823634 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 12841, length 64

  

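Likewise, the packets can be captured on the physical NIC ens192, where they appear encapsulated in UDP datagrams to port 8472: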

    [root@node1 ~]# tcpdump -i ens192 -nn host 172.16.1.102    # 172.16.1.102 is node2's physical IP
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
    10:59:24.234174 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 0, length 64
    10:59:24.234434 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 0, length 64
    10:59:25.234301 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 1, length 64
    10:59:25.234469 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 1, length 64
    10:59:26.234415 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 2, length 64
    10:59:26.234592 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 2, length 64
    10:59:27.234528 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
        IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 3, length 64
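tcpdump labels these datagrams OTV because of how it decodes UDP port 8472, but the fields it prints, flags 0x08 and instance 1, line up with a VXLAN header whose I flag is set and whose VNI is 1. A minimal sketch of that 8-byte header layout, with illustrative helper names that are not part of any library:

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte header: flags (8 bits), reserved (24), VNI (24), reserved (8)."""
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vxlan_header(header: bytes):
    """Return (flags, vni) from the first 8 bytes of the UDP payload."""
    word1, word2 = struct.unpack("!II", header[:8])
    return word1 >> 24, word2 >> 8


header = build_vxlan_header(1)
print(header.hex())                # 0800000000000100
print(parse_vxlan_header(header))  # (8, 1)
```

The inner Ethernet frame, carrying the pod-to-pod ICMP packet, follows immediately after these 8 bytes.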

  

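Next we switch flannel's communication mode to DirectRouting: download the manifest from GitHub, delete the existing network, and apply it again. This procedure is not recommended, but it is what the video does. The author edited the manifest and then restarted the k8s cluster, and with that approach every pod created afterwards was stuck in the Pending state.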

    https://github.com/coreos/flannel
    wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

In kube-flannel.yml, find the net-conf.json section and add the Directrouting line (remember the comma at the end of the line above it):

    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan",
      "Directrouting": true
    }
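The same edit can be expressed programmatically, which avoids the missing-comma mistake. A minimal sketch that works on the net-conf.json text only; NET_CONF and enable_direct_routing are hypothetical names, and the key is spelled as in this article:

```python
import json

# net-conf.json as it appears before the change (values from this article).
NET_CONF = """
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}
"""


def enable_direct_routing(net_conf_text: str) -> str:
    """Return the config with Directrouting switched on for the vxlan backend."""
    conf = json.loads(net_conf_text)
    if conf["Backend"].get("Type") != "vxlan":
        raise ValueError("DirectRouting is an option of the vxlan backend")
    conf["Backend"]["Directrouting"] = True
    return json.dumps(conf, indent=2)


print(enable_direct_routing(NET_CONF))
```

Going through a JSON parser also guarantees that the result is still valid JSON before it is pasted back into the manifest.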

 

First delete the existing flannel deployment. Do not do this in production.

[root@master flannel]# kubectl delete -f kube-flannel.yml 

  

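Then re-create flannel from the edited kube-flannel.yml (for example with kubectl apply -f kube-flannel.yml) and check the pods and the ConfigMap: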

    [root@master flannel]# kubectl get pods -n kube-system
    [root@master flannel]# kubectl get configmap kube-flannel-cfg -o json -n kube-system
    "net-conf.json": "{\n \"Network\": \"10.244.0.0/16\",\n \"Backend\": {\n \"Type\": \"vxlan\",\n \"Directrouting\": true\n }\n}\n"

  

Directrouting is present in the output, so the setting has taken effect.

 

    [root@master ~]# ip route show
    default via 172.16.1.254 dev ens192 proto static metric 100
    10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1    # the local pod subnet: delivered on this node through cni0, no other interface involved
    10.244.1.0/24 via 172.16.1.101 dev ens192    # traffic to 10.244.1.0/24 now leaves through the physical NIC ens192 with next hop 172.16.1.101, no longer through the flannel tunnel; this is DirectRouting
    10.244.2.0/24 via 172.16.1.102 dev ens192
    172.16.1.0/24 dev ens192 proto kernel scope link src 172.16.1.100 metric 100
    172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
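How the kernel picks among these routes can be sketched as a longest-prefix match. This is an illustration only: the ROUTES table is copied from the output above, lookup is a hypothetical helper, and 10.244.0.5 is a made-up local pod address.

```python
import ipaddress

# (destination, next hop or None when on-link, device), from the route table above.
ROUTES = [
    ("0.0.0.0/0", "172.16.1.254", "ens192"),
    ("10.244.0.0/24", None, "cni0"),
    ("10.244.1.0/24", "172.16.1.101", "ens192"),
    ("10.244.2.0/24", "172.16.1.102", "ens192"),
    ("172.16.1.0/24", None, "ens192"),
    ("172.17.0.0/16", None, "docker0"),
]


def lookup(dst: str):
    """Return (next_hop, device) of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(net), hop, dev)
        for net, hop, dev in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The default route always matches, so matches is never empty.
    _, hop, dev = max(matches, key=lambda m: m[0].prefixlen)
    return hop, dev


print(lookup("10.244.1.124"))  # ('172.16.1.101', 'ens192')
print(lookup("10.244.0.5"))    # (None, 'cni0')
```

A pod on node1 is reached through ens192 with node1 as the next hop, while a pod on the local node stays on cni0.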

  

Log in to a pod again and repeat the ping test:

 

    [root@master ~]# kubectl get pods -o wide
    NAME                            READY   STATUS    RESTARTS   AGE   IP             NODE
    myapp-deploy-69b47bc96d-75g2b   1/1     Running   0          12m   10.244.1.124   node1
    myapp-deploy-69b47bc96d-jwgwm   1/1     Running   0          3s    10.244.2.100   node2

  

    [root@master ~]# kubectl exec -it myapp-deploy-69b47bc96d-75g2b -- /bin/sh
    / # ping 10.244.2.100
    PING 10.244.2.100 (10.244.2.100): 56 data bytes
    64 bytes from 10.244.2.100: seq=0 ttl=62 time=0.536 ms
    64 bytes from 10.244.2.100: seq=1 ttl=62 time=0.206 ms
    64 bytes from 10.244.2.100: seq=2 ttl=62 time=0.206 ms
    64 bytes from 10.244.2.100: seq=3 ttl=62 time=0.203 ms
    64 bytes from 10.244.2.100: seq=4 ttl=62 time=0.210 ms

  

    [root@node1 ~]# tcpdump -i ens192 -nn icmp
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
    12:31:10.899403 IP 10.244.1.124 > 10.244.2.100: ICMP echo request, id 8960, seq 24, length 64
    12:31:10.899546 IP 10.244.2.100 > 10.244.1.124: ICMP echo reply, id 8960, seq 24, length 64
    12:31:11.899505 IP 10.244.1.124 > 10.244.2.100: ICMP echo request, id 8960, seq 25, length 64
    12:31:11.899639 IP 10.244.2.100 > 10.244.1.124: ICMP echo reply, id 8960, seq 25, length 64

  

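The capture shows that pings between pods now travel in and out of the physical NIC ens192 as plain, unencapsulated ICMP. This is DirectRouting, and it performs better than the default vxlan mode.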


Reposted from: https://www.cnblogs.com/dribs/p/10318200.html
