
Linux network flow analysis (1): the NIC driver

linux network driver

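There are already plenty of books that analyze Linux networking, including 《追踪Linux TCP/IP代码运行》 and 《Linux内核源码剖析——TCP/IP实现》. Here I only follow the basic path a packet takes through the Linux kernel, and try to lay out the framework of the main flow as clearly as I can.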

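How does the kernel receive data from the NIC? The traditional process is:
1. Data arrives at the NIC;
2. The NIC raises an interrupt to the kernel;
3. The kernel uses I/O instructions to read the data out of the NIC's I/O region.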

 
Many NIC drivers (the very old ones) do exactly this inside the NIC's interrupt handler.
 
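This approach has one serious problem, though: when heavy traffic arrives, the NIC raises a huge number of interrupts, and the kernel burns a lot of resources in interrupt context just handling the interrupts themselves. So the question arises: "can we do without interrupts?" That is polling, the idea behind NAPI. There is nothing mysterious about it: the kernel masks the interrupt and then asks the NIC every so often, "do you have any data?"

As this description already suggests, when there is little traffic, pure polling also wastes a lot of CPU for nothing, so each approach has its strengths. NAPI as implemented in Linux therefore combines the two: the first packet still raises an interrupt, the handler masks further receive interrupts and schedules polling, and interrupts are re-enabled once the ring has been drained.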
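To make the trade-off concrete, here is a small user-space model (not kernel code) that counts interrupts for the same traffic under both schemes. The function names, the notion of a "tick", and the traffic pattern are all assumptions made for illustration; only the budget of 64 matches the weight the driver passes to netif_napi_add further down.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional scheme: every arriving packet raises its own interrupt. */
unsigned irqs_per_packet(const unsigned *arrivals, size_t ticks)
{
    unsigned irqs = 0;

    for (size_t t = 0; t < ticks; t++)
        irqs += arrivals[t];
    return irqs;
}

/*
 * NAPI-style scheme: an interrupt is raised only when packets arrive
 * while the device is not being polled. The handler then switches to
 * polling; each tick the poller handles at most `budget` packets, and
 * interrupts are re-enabled once the backlog is empty.
 */
unsigned irqs_napi_style(const unsigned *arrivals, size_t ticks,
                         unsigned budget)
{
    unsigned irqs = 0, backlog = 0;
    int polling = 0;

    assert(budget > 0);
    for (size_t t = 0; t < ticks; t++) {
        backlog += arrivals[t];
        if (!polling && backlog > 0) {
            irqs++;          /* first packet of a burst: one interrupt */
            polling = 1;     /* further rx interrupts are masked */
        }
        if (polling) {
            unsigned done = backlog < budget ? backlog : budget;

            backlog -= done;
            if (backlog == 0)
                polling = 0; /* drained: back to interrupt mode */
        }
    }
    return irqs;
}
```

With a burst of 300 packets followed by a lone packet some ticks later, the first function reports 301 interrupts while the second reports 2, one per burst. During the idle ticks the model does no polling at all, which is the point of combining the two schemes.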
 
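OK, the next problem is reading the data out of the NIC's I/O region, whether that is I/O registers or I/O memory. The CPU has to do the reading, which also costs CPU time: "the CPU reads from the I/O region and then puts the data into memory" (memory here means the system's own physical memory, also called main memory, which has nothing to do with memory on the peripheral). So naturally people thought of DMA: let the NIC read and write its I/O data directly in main memory, and leave the CPU free to do something else:
1. First, the kernel sets up a ring of buffers in main memory for sending and receiving data (usually called the DMA ring buffer).
2. The kernel hands this ring to the NIC by creating DMA mappings for the buffers;
3. When the NIC receives data, it puts it straight into the ring buffer, that is, straight into main memory, and then raises an interrupt to the system;
4. When the kernel gets the interrupt, it removes the DMA mapping and can then read the data directly from main memory;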
 
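This takes a lot less work than the traditional process: the device puts the data directly into main memory without any CPU involvement, so efficiency improves considerably.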
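The four steps above can be sketched as a user-space model of a receive ring. This is only an illustration under stated assumptions: memcpy stands in for the DMA transfer, a `done` flag stands in for the descriptor status the hardware would write, and every name here (rx_slot, rx_ring, device_receive, driver_clean_one, the ring and buffer sizes) is made up for the sketch.

```c
#include <assert.h>
#include <string.h>

#define RING_SIZE 8   /* illustrative; the driver below defaults to 256 */
#define BUF_SIZE  64

struct rx_slot {
    unsigned char data[BUF_SIZE];
    unsigned len;
    int done;                 /* set by the "device" once data is in place */
};

struct rx_ring {
    struct rx_slot slot[RING_SIZE];
    unsigned next_to_use;     /* next slot the device will fill */
    unsigned next_to_clean;   /* next slot the driver will read */
};

/* Steps 1 and 2: set up an empty ring that the device may write into. */
void ring_init(struct rx_ring *r)
{
    memset(r, 0, sizeof(*r));
}

/*
 * Step 3, device side: place a packet into the next free slot.
 * Returns 0 on success, -1 if the ring is full and the packet is dropped.
 */
int device_receive(struct rx_ring *r, const void *pkt, unsigned len)
{
    struct rx_slot *s = &r->slot[r->next_to_use];

    assert(len <= BUF_SIZE);
    if (s->done)
        return -1;            /* driver has not cleaned this slot yet */
    memcpy(s->data, pkt, len);
    s->len = len;
    s->done = 1;
    r->next_to_use = (r->next_to_use + 1) % RING_SIZE;
    return 0;
}

/*
 * Step 4, driver side: take one completed packet out of the ring.
 * Returns the packet length, or 0 if nothing is waiting.
 */
unsigned driver_clean_one(struct rx_ring *r, void *out)
{
    struct rx_slot *s = &r->slot[r->next_to_clean];
    unsigned len;

    if (!s->done)
        return 0;
    len = s->len;
    memcpy(out, s->data, len);
    s->done = 0;              /* slot goes back to the device */
    r->next_to_clean = (r->next_to_clean + 1) % RING_SIZE;
    return len;
}
```

The two indices chase each other around the ring: the device only ever advances next_to_use, the driver only ever advances next_to_clean, and a full ring shows up as a dropped packet rather than as overwritten data.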
 
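Corresponding to the four steps above, here is how each one is implemented:
1) Allocate the DMA ring buffer
In the Linux kernel a buffer is described by an skb (struct sk_buff). Allocating the ring means creating a certain number of skbs and linking them together through the ring descriptor, e1000_rx_ring in the e1000 driver (the e1000e code below uses struct e1000_ring).
2) Create the DMA mapping
The kernel creates the mapping by calling
dma_map_single(struct device *dev,void *buffer,size_t size,enum dma_data_direction direction)
where:
struct device *dev describes the device;
buffer: the address to map for the device, that is, the data area of one skb. To map all of them, simply loop over the ring;
size: the size of the buffer;
direction: the direction of the transfer, that is, who sends to whom. Receive buffers are normally mapped DMA_FROM_DEVICE and transmit buffers DMA_TO_DEVICE; DMA_BIDIRECTIONAL is available when data flows both ways;
For PCI devices (NICs are generally PCI devices) there is a wrapper, pci_map_single. After this call the buffer has been handed to the device, which can read and write it directly.
3) This step is done by the hardware;
4) Remove the mapping
This is dma_unmap_single; PCI drivers mostly call its wrapper pci_unmap_single. Until the mapping is removed the device still owns the buffer, so the driver must call it to give control back to the CPU: the data has been received, and it is the CPU's job to pass it up to the network stack. Before unmapping, the driver often needs to read some status bits and the like; it usually calls dma_sync_single_for_cpu() so that the CPU can access the contents of the DMA buffer before the mapping is removed.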
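The ownership rule behind map, sync and unmap can be written down as a tiny state machine. The sketch below is a user-space model, not the kernel API: the model_* functions and the dma_buf_model structure are hypothetical names that only mirror the order in which a driver would call dma_map_single, dma_sync_single_for_cpu, dma_sync_single_for_device and dma_unmap_single.

```c
#include <assert.h>

enum owner { OWNER_CPU, OWNER_DEVICE };

struct dma_buf_model {
    enum owner owner;   /* who may touch the buffer right now */
    int mapped;         /* 1 between map and unmap */
};

/* Mirrors dma_map_single: the buffer is handed to the device. */
void model_map(struct dma_buf_model *b)
{
    assert(!b->mapped);
    b->mapped = 1;
    b->owner = OWNER_DEVICE;
}

/* Mirrors dma_sync_single_for_cpu: CPU may look at a still-mapped buffer. */
void model_sync_for_cpu(struct dma_buf_model *b)
{
    assert(b->mapped);
    b->owner = OWNER_CPU;
}

/* Mirrors dma_sync_single_for_device: hand the buffer back to the device. */
void model_sync_for_device(struct dma_buf_model *b)
{
    assert(b->mapped);
    b->owner = OWNER_DEVICE;
}

/* Mirrors dma_unmap_single: mapping is gone, CPU owns the buffer again. */
void model_unmap(struct dma_buf_model *b)
{
    assert(b->mapped);
    b->mapped = 0;
    b->owner = OWNER_CPU;
}

/* The CPU should only read or write the buffer when this returns 1. */
int cpu_may_access(const struct dma_buf_model *b)
{
    return b->owner == OWNER_CPU;
}
```

A receive path following the text would call model_map when the buffer is posted, model_sync_for_cpu to peek at status or headers, and model_unmap before the skb is passed up the stack.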

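A packet arrives at the NIC as an optical or electrical signal, passes through the NIC driver, where it is turned into an skb, and then enters the link layer. So I will start by analyzing the flow through the NIC driver.

Source location: the drivers/net/e1000e directory.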

    static int __init e1000_init_module(void)
    {
        /* Register the NIC driver, following the usual PCI driver model */
        int ret;
        printk(KERN_INFO "%s: Intel(R) PRO/1000 Network Driver - %s\n",
               e1000e_driver_name, e1000e_driver_version);
        printk(KERN_INFO "%s: Copyright (c) 1999-2008 Intel Corporation.\n",
               e1000e_driver_name);
        ret = pci_register_driver(&e1000_driver);
        pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, e1000e_driver_name,
                               PM_QOS_DEFAULT_VALUE);
        return ret;
    }

  Next, look at the contents of the driver structure. I will not go into PCI driver development here.

    /* PCI Device API Driver */
    static struct pci_driver e1000_driver = {
        .name        = e1000e_driver_name,
        .id_table    = e1000_pci_tbl,
        .probe       = e1000_probe,
        .remove      = __devexit_p(e1000_remove),
    #ifdef CONFIG_PM
        /* Power Management Hooks */
        .suspend     = e1000_suspend,
        .resume      = e1000_resume,
    #endif
        .shutdown    = e1000_shutdown,
        .err_handler = &e1000_err_handler
    };

  The most important function here is e1000_probe. Its header comment states its purpose: "Device Initialization Routine", which should not be hard to understand.

    static int __devinit e1000_probe(struct pci_dev *pdev,
                                     const struct pci_device_id *ent)
    {
        struct net_device *netdev;
        struct e1000_adapter *adapter;
        struct e1000_hw *hw;
        const struct e1000_info *ei = e1000_info_tbl[ent->driver_data];
        resource_size_t mmio_start, mmio_len;
        resource_size_t flash_start, flash_len;
        static int cards_found;
        int i, err, pci_using_dac;
        u16 eeprom_data = 0;
        u16 eeprom_apme_mask = E1000_EEPROM_APME;

        e1000e_disable_l1aspm(pdev);

        /* Device initialization starts here: name, memory regions and so on */
        err = pci_enable_device_mem(pdev);
        if (err)
            return err;

        pci_using_dac = 0;
        err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
        if (!err) {
            err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
            if (!err)
                pci_using_dac = 1;
        } else {
            err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
            if (err) {
                err = pci_set_consistent_dma_mask(pdev,
                                                  DMA_BIT_MASK(32));
                if (err) {
                    dev_err(&pdev->dev, "No usable DMA "
                            "configuration, aborting\n");
                    goto err_dma;
                }
            }
        }

        err = pci_request_selected_regions_exclusive(pdev,
                                  pci_select_bars(pdev, IORESOURCE_MEM),
                                  e1000e_driver_name);
        if (err)
            goto err_pci_reg;

        /* AER (Advanced Error Reporting) hooks */
        err = pci_enable_pcie_error_reporting(pdev);
        if (err) {
            dev_err(&pdev->dev, "pci_enable_pcie_error_reporting failed "
                    "0x%x\n", err);
            /* non-fatal, continue */
        }

        pci_set_master(pdev);
        /* PCI config space info */
        err = pci_save_state(pdev);
        if (err)
            goto err_alloc_etherdev;

        err = -ENOMEM;
        /*
         * Allocate the net_device together with the driver's private
         * e1000_adapter; everything the driver does later is built on it.
         */
        netdev = alloc_etherdev(sizeof(struct e1000_adapter));
        if (!netdev)
            goto err_alloc_etherdev;

        SET_NETDEV_DEV(netdev, &pdev->dev);

        pci_set_drvdata(pdev, netdev);
        adapter = netdev_priv(netdev);
        hw = &adapter->hw;
        adapter->netdev = netdev;
        adapter->pdev = pdev;
        adapter->ei = ei;
        adapter->pba = ei->pba;
        adapter->flags = ei->flags;
        adapter->flags2 = ei->flags2;
        adapter->hw.adapter = adapter;
        adapter->hw.mac.type = ei->mac;
        adapter->max_hw_frame_size = ei->max_hw_frame_size;
        adapter->msg_enable = (1 << NETIF_MSG_DRV | NETIF_MSG_PROBE) - 1;

        /* BAR 0 is the device's memory-mapped register region */
        mmio_start = pci_resource_start(pdev, 0);
        mmio_len = pci_resource_len(pdev, 0);

        err = -EIO;
        /*
         * ioremap() maps BAR 0 into kernel virtual address space. hw_addr
         * is the base address through which the driver accesses the NIC's
         * registers (it is not the MAC address).
         */
        adapter->hw.hw_addr = ioremap(mmio_start, mmio_len);
        if (!adapter->hw.hw_addr)
            goto err_ioremap;

        if ((adapter->flags & FLAG_HAS_FLASH) &&
            (pci_resource_flags(pdev, 1) & IORESOURCE_MEM)) {
            flash_start = pci_resource_start(pdev, 1);
            flash_len = pci_resource_len(pdev, 1);
            adapter->hw.flash_address = ioremap(flash_start, flash_len);
            if (!adapter->hw.flash_address)
                goto err_flashmap;
        }

        /* construct the net_device struct */
        netdev->netdev_ops = &e1000e_netdev_ops;
        e1000e_set_ethtool_ops(netdev);
        netdev->watchdog_timeo = 5 * HZ;
        netif_napi_add(netdev, &adapter->napi, e1000_clean, 64);
        strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);

        netdev->mem_start = mmio_start;
        netdev->mem_end = mmio_start + mmio_len;

        adapter->bd_number = cards_found++;

        e1000e_check_options(adapter);

        /* setup adapter struct */
        err = e1000_sw_init(adapter);
        if (err)
            goto err_sw_init;

        err = -EIO;

        memcpy(&hw->mac.ops, ei->mac_ops, sizeof(hw->mac.ops));
        memcpy(&hw->nvm.ops, ei->nvm_ops, sizeof(hw->nvm.ops));
        memcpy(&hw->phy.ops, ei->phy_ops, sizeof(hw->phy.ops));

        err = ei->get_variants(adapter);
        if (err)
            goto err_hw_init;

        if ((adapter->flags & FLAG_IS_ICH) &&
            (adapter->flags & FLAG_READ_ONLY_NVM))
            e1000e_write_protect_nvm_ich8lan(&adapter->hw);

        hw->mac.ops.get_bus_info(&adapter->hw);

        adapter->hw.phy.autoneg_wait_to_complete = 0;

        /* Copper options */
        if (adapter->hw.phy.media_type == e1000_media_type_copper) {
            adapter->hw.phy.mdix = AUTO_ALL_MODES;
            adapter->hw.phy.disable_polarity_correction = 0;
            adapter->hw.phy.ms_type = e1000_ms_hw_default;
        }

        if (e1000_check_reset_block(&adapter->hw))
            e_info("PHY reset is blocked due to SOL/IDER session.\n");

        netdev->features = NETIF_F_SG |
                           NETIF_F_HW_CSUM |
                           NETIF_F_HW_VLAN_TX |
                           NETIF_F_HW_VLAN_RX;

        if (adapter->flags & FLAG_HAS_HW_VLAN_FILTER)
            netdev->features |= NETIF_F_HW_VLAN_FILTER;

        netdev->features |= NETIF_F_TSO;
        netdev->features |= NETIF_F_TSO6;

        netdev->vlan_features |= NETIF_F_TSO;
        netdev->vlan_features |= NETIF_F_TSO6;
        netdev->vlan_features |= NETIF_F_HW_CSUM;
        netdev->vlan_features |= NETIF_F_SG;

        if (pci_using_dac)
            netdev->features |= NETIF_F_HIGHDMA;

        if (e1000e_enable_mng_pass_thru(&adapter->hw))
            adapter->flags |= FLAG_MNG_PT_ENABLED;

        /*
         * before reading the NVM, reset the controller to
         * put the device in a known good starting state
         */
        adapter->hw.mac.ops.reset_hw(&adapter->hw);

        /*
         * systems with ASPM and others may see the checksum fail on the first
         * attempt. Let's give it a few tries
         */
        for (i = 0;; i++) {
            if (e1000_validate_nvm_checksum(&adapter->hw) >= 0)
                break;
            if (i == 2) {
                e_err("The NVM Checksum Is Not Valid\n");
                err = -EIO;
                goto err_eeprom;
            }
        }

        e1000_eeprom_checks(adapter);

        /* copy the MAC address out of the NVM */
        if (e1000e_read_mac_addr(&adapter->hw))
            e_err("NVM Read Error while reading MAC address\n");

        memcpy(netdev->dev_addr, adapter->hw.mac.addr, netdev->addr_len);
        memcpy(netdev->perm_addr, adapter->hw.mac.addr, netdev->addr_len);

        if (!is_valid_ether_addr(netdev->perm_addr)) {
            e_err("Invalid MAC Address: %pM\n", netdev->perm_addr);
            err = -EIO;
            goto err_eeprom;
        }

        init_timer(&adapter->watchdog_timer);
        adapter->watchdog_timer.function = &e1000_watchdog;
        adapter->watchdog_timer.data = (unsigned long) adapter;

        init_timer(&adapter->phy_info_timer);
        adapter->phy_info_timer.function = &e1000_update_phy_info;
        adapter->phy_info_timer.data = (unsigned long) adapter;

        INIT_WORK(&adapter->reset_task, e1000_reset_task);
        INIT_WORK(&adapter->watchdog_task, e1000_watchdog_task);
        INIT_WORK(&adapter->downshift_task, e1000e_downshift_workaround);
        INIT_WORK(&adapter->update_phy_task, e1000e_update_phy_task);

        /* Initialize link parameters. User can change them with ethtool */
        adapter->hw.mac.autoneg = 1;
        adapter->fc_autoneg = 1;
        adapter->hw.fc.requested_mode = e1000_fc_default;
        adapter->hw.fc.current_mode = e1000_fc_default;
        adapter->hw.phy.autoneg_advertised = 0x2f;

        /*
         * The default receive and transmit ring size is 256. A single
         * interrupt does not actually deliver that many packets; in my
         * experiments it was only about 12. The ring does not hold
         * sk_buffs permanently: it is where DMA'd data sits until the
         * interrupt is serviced, after which the sk_buff is handed on.
         */
        /* ring size defaults */
        adapter->rx_ring->count = 256;
        adapter->tx_ring->count = 256;

        /*
         * Initial Wake on LAN setting - If APM wake is enabled in
         * the EEPROM, enable the ACPI Magic Packet filter
         */
        if (adapter->flags & FLAG_APME_IN_WUC) {
            /* APME bit in EEPROM is mapped to WUC.APME */
            eeprom_data = er32(WUC);
            eeprom_apme_mask = E1000_WUC_APME;
            if (eeprom_data & E1000_WUC_PHY_WAKE)
                adapter->flags2 |= FLAG2_HAS_PHY_WAKEUP;
        } else if (adapter->flags & FLAG_APME_IN_CTRL3) {
            if (adapter->flags & FLAG_APME_CHECK_PORT_B &&
                (adapter->hw.bus.func == 1))
                e1000_read_nvm(&adapter->hw,
                               NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data);
            else
                e1000_read_nvm(&adapter->hw,
                               NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data);
        }

        /* fetch WoL from EEPROM */
        if (eeprom_data & eeprom_apme_mask)
            adapter->eeprom_wol |= E1000_WUFC_MAG;

        /*
         * now that we have the eeprom settings, apply the special cases
         * where the eeprom may be wrong or the board simply won't support
         * wake on lan on a particular port
         */
        if (!(adapter->flags & FLAG_HAS_WOL))
            adapter->eeprom_wol = 0;

        /* initialize the wol settings based on the eeprom settings */
        adapter->wol = adapter->eeprom_wol;
        device_set_wakeup_enable(&adapter->pdev->dev, adapter->wol);

        /* save off EEPROM version number */
        e1000_read_nvm(&adapter->hw, 5, 1, &adapter->eeprom_vers);

        /* reset the hardware with the new settings */
        e1000e_reset(adapter);

        /*
         * If the controller has AMT, do not set DRV_LOAD until the interface
         * is up. For all other cases, let the f/w know that the h/w is now
         * under the control of the driver.
         */
        if (!(adapter->flags & FLAG_HAS_AMT))
            e1000_get_hw_control(adapter);

        strcpy(netdev->name, "eth%d");
        /* Register the network device with the kernel */
        err = register_netdev(netdev);
        if (err)
            goto err_register;

        /* carrier off reporting is important to ethtool even BEFORE open */
        netif_carrier_off(netdev);

        e1000_print_device_info(adapter);

        return 0;

    err_register:
        if (!(adapter->flags & FLAG_HAS_AMT))
            e1000_release_hw_control(adapter);
    err_eeprom:
        if (!e1000_check_reset_block(&adapter->hw))
            e1000_phy_hw_reset(&adapter->hw);
    err_hw_init:
        kfree(adapter->tx_ring);
        kfree(adapter->rx_ring);
    err_sw_init:
        if (adapter->hw.flash_address)
            iounmap(adapter->hw.flash_address);
        e1000e_reset_interrupt_capability(adapter);
    err_flashmap:
        iounmap(adapter->hw.hw_addr);
    err_ioremap:
        free_netdev(netdev);
    err_alloc_etherdev:
        pci_release_selected_regions(pdev,
                                     pci_select_bars(pdev, IORESOURCE_MEM));
    err_pci_reg:
    err_dma:
        pci_disable_device(pdev);
        return err;
    }

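  The function above completes driver initialization and registers the network device. Below are the operations registered for the network device: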

    static const struct net_device_ops e1000e_netdev_ops = {
        .ndo_open               = e1000_open,
        .ndo_stop               = e1000_close,
        .ndo_start_xmit         = e1000_xmit_frame,
        .ndo_get_stats          = e1000_get_stats,
        .ndo_set_multicast_list = e1000_set_multi,
        .ndo_set_mac_address    = e1000_set_mac,
        .ndo_change_mtu         = e1000_change_mtu,
        .ndo_do_ioctl           = e1000_ioctl,
        .ndo_tx_timeout         = e1000_tx_timeout,
        .ndo_validate_addr      = eth_validate_addr,
        .ndo_vlan_rx_register   = e1000_vlan_rx_register,
        .ndo_vlan_rx_add_vid    = e1000_vlan_rx_add_vid,
        .ndo_vlan_rx_kill_vid   = e1000_vlan_rx_kill_vid,
    #ifdef CONFIG_NET_POLL_CONTROLLER
        .ndo_poll_controller    = e1000_netpoll,
    #endif
    };

  Note the last function in this table:

    static void e1000_netpoll(struct net_device *netdev)
    {
        struct e1000_adapter *adapter = netdev_priv(netdev);
        disable_irq(adapter->pdev->irq);         /* disable the device's IRQ line */
        e1000_intr(adapter->pdev->irq, netdev);  /* call the interrupt handler directly */
        enable_irq(adapter->pdev->irq);
    }

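  Below is the NIC driver's interrupt handler, that is, the top half. It only reads the interrupt cause and schedules NAPI; the actual receive work happens later in the NET_RX_SOFTIRQ softirq (the bottom half).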

    static irqreturn_t e1000_intr(int irq, void *data)
    {
        struct net_device *netdev = data;
        struct e1000_adapter *adapter = netdev_priv(netdev);
        struct e1000_hw *hw = &adapter->hw;
        u32 rctl, icr = er32(ICR);

        if (!icr)
            return IRQ_NONE;  /* Not our interrupt */

        /*
         * IMS will not auto-mask if INT_ASSERTED is not set, and if it is
         * not set, then the adapter didn't send an interrupt
         */
        if (!(icr & E1000_ICR_INT_ASSERTED))
            return IRQ_NONE;

        /*
         * Interrupt Auto-Mask...upon reading ICR,
         * interrupts are masked. No need for the
         * IMC write
         */

        if (icr & E1000_ICR_LSC) {
            hw->mac.get_link_status = 1;
            /*
             * ICH8 workaround-- Call gig speed drop workaround on cable
             * disconnect (LSC) before accessing any PHY registers
             */
            if ((adapter->flags & FLAG_LSC_GIG_SPEED_DROP) &&
                (!(er32(STATUS) & E1000_STATUS_LU)))
                schedule_work(&adapter->downshift_task);

            /*
             * 80003ES2LAN workaround--
             * For packet buffer work-around on link down event;
             * disable receives here in the ISR and
             * reset adapter in watchdog
             */
            if (netif_carrier_ok(netdev) &&
                (adapter->flags & FLAG_RX_NEEDS_RESTART)) {
                /* disable receives */
                rctl = er32(RCTL);
                ew32(RCTL, rctl & ~E1000_RCTL_EN);
                adapter->flags |= FLAG_RX_RESTART_NOW;
            }
            /* guard against interrupt when we're going down */
            if (!test_bit(__E1000_DOWN, &adapter->state))
                mod_timer(&adapter->watchdog_timer, jiffies + 1);
        }

        /*
         * __napi_schedule() puts the device's napi_struct on the current
         * CPU's poll list
         */
        if (napi_schedule_prep(&adapter->napi)) {
            adapter->total_tx_bytes = 0;
            adapter->total_tx_packets = 0;
            adapter->total_rx_bytes = 0;
            adapter->total_rx_packets = 0;
            __napi_schedule(&adapter->napi);
        }

        return IRQ_HANDLED;
    }

  

    void __napi_schedule(struct napi_struct *n)
    {
        unsigned long flags;

        local_irq_save(flags);
        /* add the adapter's napi_struct to this CPU's poll_list */
        list_add_tail(&n->poll_list, &__get_cpu_var(softnet_data).poll_list);
        /* raise NET_RX_SOFTIRQ; its handler runs when softirqs are next processed */
        __raise_softirq_irqoff(NET_RX_SOFTIRQ);
        local_irq_restore(flags);
    }
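The hand-off from the interrupt handler to the softirq can be modelled in user space as follows. This is a sketch under stated assumptions, not the kernel's implementation: a plain `scheduled` flag stands in for the state bit that napi_schedule_prep tests and sets, a singly linked list stands in for the per-CPU poll_list, and rx_softirq is a simplified stand-in for the NET_RX_SOFTIRQ handler. All names are hypothetical; the weight of 64 is the value passed to netif_napi_add in e1000_probe.

```c
#include <assert.h>
#include <stddef.h>

struct napi_model {
    int scheduled;            /* already on a poll list? */
    unsigned pending;         /* packets waiting in the device's rx ring */
    unsigned weight;          /* max packets handled per poll */
    struct napi_model *next;  /* link in the poll list */
};

struct poll_list {
    struct napi_model *head, *tail;
};

static void poll_list_add_tail(struct poll_list *l, struct napi_model *n)
{
    n->next = NULL;
    if (l->tail)
        l->tail->next = n;
    else
        l->head = n;
    l->tail = n;
}

/*
 * Interrupt side: queue the instance unless it is already queued.
 * Returns 1 if this call queued it, 0 if it was already scheduled.
 */
int irq_schedule(struct poll_list *l, struct napi_model *n)
{
    assert(n->weight > 0);
    if (n->scheduled)
        return 0;
    n->scheduled = 1;
    poll_list_add_tail(l, n);
    return 1;
}

/*
 * Softirq side: poll every queued instance once. An instance that used
 * its whole weight goes back on the list; one that ran out of packets
 * is completed. Returns the number of packets processed in this pass.
 */
unsigned rx_softirq(struct poll_list *l)
{
    struct napi_model *n = l->head;
    unsigned total = 0;

    l->head = l->tail = NULL;
    while (n) {
        struct napi_model *next = n->next;
        unsigned work = n->pending < n->weight ? n->pending : n->weight;

        n->pending -= work;
        total += work;
        if (work < n->weight)
            n->scheduled = 0;          /* drained: interrupts may resume */
        else
            poll_list_add_tail(l, n);  /* more to do on the next pass */
        n = next;
    }
    return total;
}
```

With 100 packets pending and a weight of 64, the first pass handles 64 packets and leaves the instance queued; the second pass handles the remaining 36 and completes it. A second interrupt arriving in between is a no-op, which is why the handler checks napi_schedule_prep before calling __napi_schedule.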

  Next, look at how the network device is opened:

    static int e1000_open(struct net_device *netdev)
    {
        struct e1000_adapter *adapter = netdev_priv(netdev);
        struct e1000_hw *hw = &adapter->hw;
        int err;

        /* disallow open during test */
        if (test_bit(__E1000_TESTING, &adapter->state))
            return -EBUSY;

        netif_carrier_off(netdev);

        /*
         * Set up the transmit and receive descriptors, i.e. initialize
         * the tx and rx rings, each of which needs room for 256 entries
         */
        /* allocate transmit descriptors */
        err = e1000e_setup_tx_resources(adapter);
        if (err)
            goto err_setup_tx;

        /* allocate receive descriptors */
        err = e1000e_setup_rx_resources(adapter);
        if (err)
            goto err_setup_rx;

        e1000e_power_up_phy(adapter);

        adapter->mng_vlan_id = E1000_MNG_VLAN_NONE;
        if ((adapter->hw.mng_cookie.status &
             E1000_MNG_DHCP_COOKIE_STATUS_VLAN))
            e1000_update_mng_vlan(adapter);

        /*
         * If AMT is enabled, let the firmware know that the network
         * interface is now open
         */
        if (adapter->flags & FLAG_HAS_AMT)
            e1000_get_hw_control(adapter);

        /*
         * before we allocate an interrupt, we must be ready to handle it.
         * Setting DEBUG_SHIRQ in the kernel makes it fire an interrupt
         * as soon as we call pci_request_irq, so we have to setup our
         * clean_rx handler before we do so.
         */
        /*
         * This call matters: it configures the adapter, including the
         * choice of the clean_rx handler (its body is shown after this
         * function)
         */
        e1000_configure(adapter);

        err = e1000_request_irq(adapter);
        if (err)
            goto err_req_irq;

        /*
         * Work around PCIe errata with MSI interrupts causing some chipsets to
         * ignore e1000e MSI messages, which means we need to test our MSI
         * interrupt now
         */
        if (adapter->int_mode != E1000E_INT_MODE_LEGACY) {
            err = e1000_test_msi(adapter);
            if (err) {
                e_err("Interrupt allocation failed\n");
                goto err_req_irq;
            }
        }

        /* From here on the code is the same as e1000e_up() */
        clear_bit(__E1000_DOWN, &adapter->state);

        napi_enable(&adapter->napi);

        e1000_irq_enable(adapter);

        netif_start_queue(netdev);

        /* fire a link status change interrupt to start the watchdog */
        ew32(ICS, E1000_ICS_LSC);

        return 0;

    err_req_irq:
        e1000_release_hw_control(adapter);
        e1000_power_down_phy(adapter);
        e1000e_free_rx_resources(adapter);
    err_setup_rx:
        e1000e_free_tx_resources(adapter);
    err_setup_tx:
        e1000e_reset(adapter);

        return err;
    }

  e1000_configure, called from e1000_open above, looks like this:

    static void e1000_configure(struct e1000_adapter *adapter)
    {
        e1000_set_multi(adapter->netdev);
        e1000_restore_vlan(adapter);
        e1000_init_manageability(adapter);

        e1000_configure_tx(adapter);    /* configure transmit */
        e1000_setup_rctl(adapter);
        e1000_configure_rx(adapter);    /* configure receive */
        adapter->alloc_rx_buf(adapter, e1000_desc_unused(adapter->rx_ring));
    }

  Now look at how the receive side of the adapter is configured, including its interrupt settings:

    static void e1000_configure_rx(struct e1000_adapter *adapter)
    {
        struct e1000_hw *hw = &adapter->hw;
        struct e1000_ring *rx_ring = adapter->rx_ring;
        u64 rdba;
        u32 rdlen, rctl, rxcsum, ctrl_ext;

        if (adapter->rx_ps_pages) {
            /* this is a 32 byte descriptor */
            rdlen = rx_ring->count *
                    sizeof(union e1000_rx_desc_packet_split);
            adapter->clean_rx = e1000_clean_rx_irq_ps;
            adapter->alloc_rx_buf = e1000_alloc_rx_buffers_ps;
        } else if (adapter->netdev->mtu > ETH_FRAME_LEN + ETH_FCS_LEN) {
            rdlen = rx_ring->count * sizeof(struct e1000_rx_desc);
            adapter->clean_rx = e1000_clean_jumbo_rx_irq;
            adapter->alloc_rx_buf = e1000_alloc_jumbo_rx_buffers;
        } else {
            rdlen = rx_ring->count * sizeof(struct e1000_rx_desc);
            /*
             * Receive clean-up routine for standard frames. It is run
             * from the NAPI poll function (e1000_clean): it takes the
             * frames the NIC has already DMA'd into the ring and hands
             * them on for further processing.
             */
            adapter->clean_rx = e1000_clean_rx_irq;
            adapter->alloc_rx_buf = e1000_alloc_rx_buffers;
        }

        /* disable receives while setting up the descriptors */
        /* (write the receive control register to stop receiving for now) */
        rctl = er32(RCTL);
        ew32(RCTL, rctl & ~E1000_RCTL_EN);
        e1e_flush();
        msleep(10);

        /* set the Receive Delay Timer Register (RDTR) */
        ew32(RDTR, adapter->rx_int_delay);

        /* irq moderation */
        /* (set the RADV register; see the developer's manual for details) */
        ew32(RADV, adapter->rx_abs_int_delay);
        if (adapter->itr_setting != 0)
            ew32(ITR, 1000000000 / (adapter->itr * 256));

        ctrl_ext = er32(CTRL_EXT);
        /* Reset delay timers after every interrupt */
        ctrl_ext |= E1000_CTRL_EXT_INT_TIMER_CLR;
        /* Auto-Mask interrupts upon ICR access */
        ctrl_ext |= E1000_CTRL_EXT_IAME;
        ew32(IAM, 0xffffffff);
        ew32(CTRL_EXT, ctrl_ext);
        e1e_flush();

        /*
         * Setup the HW Rx Head and Tail Descriptor Pointers and
         * the Base and Length of the Rx Descriptor Ring
         */
        /*
         * Four registers describe the receive descriptor ring.
         * RDBA holds the base address of the descriptor ring; it is 64
         * bits wide, split into 32-bit low and high halves.
         * RDLEN is the total size in bytes of the descriptor ring.
         * RDH and RDT are the head and tail pointers, stored as offsets
         * from the base. RDH is advanced by the hardware and points to
         * the descriptor the next DMA will use. RDT is advanced by
         * software and marks the end of the descriptors that have been
         * handed to the hardware.
         */
        rdba = rx_ring->dma;
        ew32(RDBAL, (rdba & DMA_BIT_MASK(32)));
        ew32(RDBAH, (rdba >> 32));
        ew32(RDLEN, rdlen);
        ew32(RDH, 0);
        ew32(RDT, 0);
        rx_ring->head = E1000_RDH;
        rx_ring->tail = E1000_RDT;

        /* Enable Receive Checksum Offload for TCP and UDP */
        rxcsum = er32(RXCSUM);
        if (adapter->flags & FLAG_RX_CSUM_ENABLED) {
            rxcsum |= E1000_RXCSUM_TUOFL;

            /*
             * IPv4 payload checksum for UDP fragments must be
             * used in conjunction with packet-split.
             */
            if (adapter->rx_ps_pages)
                rxcsum |= E1000_RXCSUM_IPPCSE;
        } else {
            rxcsum &= ~E1000_RXCSUM_TUOFL;
            /* no need to clear IPPCSE as it defaults to 0 */
        }
        ew32(RXCSUM, rxcsum);

        /*
         * Enable early receives on supported devices, only takes effect when
         * packet size is equal or larger than the specified value (in 8 byte
         * units), e.g. using jumbo frames when setting to E1000_ERT_2048
         */
        if ((adapter->flags & FLAG_HAS_ERT) &&
            (adapter->netdev->mtu > ETH_DATA_LEN)) {
            u32 rxdctl = er32(RXDCTL(0));
            ew32(RXDCTL(0), rxdctl | 0x3);
            ew32(ERT, E1000_ERT_2048 | (1 << 13));
            /*
             * With jumbo frames and early-receive enabled, excessive
             * C4->C2 latencies result in dropped transactions.
             */
            pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
                                      e1000e_driver_name, 55);
        } else {
            pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
                                      e1000e_driver_name,
                                      PM_QOS_DEFAULT_VALUE);
        }

        /* Enable Receives */
        ew32(RCTL, rctl);
    }
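A few of the register values written by e1000_configure_rx are plain arithmetic, and it helps to work them through once. The helpers below are a user-space sketch with made-up names; they only reproduce the expressions from the listing. Two inputs are assumptions for the example rather than values taken from the text: an interrupt rate of 20000 per second, and a 16-byte legacy receive descriptor (the 32-byte packet-split size is stated in the listing's own comment).

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of the ring's DMA address, as written to RDBAL. */
uint32_t dma_addr_low(uint64_t rdba)
{
    return (uint32_t)(rdba & 0xffffffffULL);
}

/* High 32 bits of the ring's DMA address, as written to RDBAH. */
uint32_t dma_addr_high(uint64_t rdba)
{
    return (uint32_t)(rdba >> 32);
}

/*
 * Value written to ITR for a target interrupt rate. The expression
 * 1000000000 / (itr * 256) converts interrupts per second into a
 * minimum gap between interrupts, counted in units of 256 ns.
 */
uint32_t itr_register_value(uint32_t irqs_per_sec)
{
    assert(irqs_per_sec > 0);
    return (uint32_t)(1000000000ULL / ((uint64_t)irqs_per_sec * 256));
}

/* Value written to RDLEN: ring size in bytes. */
uint32_t rdlen_bytes(uint32_t count, uint32_t desc_size)
{
    return count * desc_size;
}
```

For a ring at DMA address 0x123456000 the driver would write 0x23456000 to RDBAL and 0x1 to RDBAH. A rate of 20000 interrupts per second gives an ITR value of 195, i.e. a gap of about 50 microseconds, and the default ring of 256 descriptors occupies 4096 bytes with 16-byte descriptors or 8192 bytes with 32-byte packet-split descriptors.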

  

Reposted from: https://www.cnblogs.com/gogly/archive/2012/06/10/2541573.html
