
Linux Kernel Project: vwifi

Contributor: willwillhi1

Questions

  • ?

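Task overview

Consolidate the 2022 report, explain how the Linux kernel's cfg80211 wireless framework and vwifi work, and use network namespaces to set up a wireless test environment. The planned work items are:

TODO: Explain how the Linux kernel's cfg80211 wireless framework and vwifi work

The original assignment asks for a write-up primarily in Chinese that covers the vwifi background knowledge and the related cfg80211 wireless framework.

References:

Background knowledge on vwifi

vwifi code walkthrough

struct owl_context: the top-level structure; the whole driver has exactly one instance.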

struct owl_context {
    /* We may not need this lock, cause vif_list would not change during
     * the whole lifetime.
     */
    struct mutex lock;
    /* Indicate the program state */
    enum vwifi_state state;
    /* List for maintaining all interfaces */
    struct list_head vif_list;
    /* List for maintaining multiple AP */
    struct list_head ap_list;
};

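Dung-Ru Tsai: Could you explain the relationship between struct owl_context and struct owl_vif?

willwillhi1: As I understand it, owl_context is the structure vwifi uses to manage all virtual interfaces (owl_vif).
In owl_context, vif_list tracks every virtual interface, and ap_list tracks the virtual interfaces that are in AP mode.
owl_context also holds the state of the whole program, which is why it has a vwifi_state field describing the current state of vwifi.
owl_vif is the structure that represents one virtual interface; vwifi currently supports the AP and STA modes.

struct owl_vif: the private data of a net_device, representing a virtual interface. STA, AP, and ad-hoc interfaces all use this same structure (a union inside it holds the mode-specific fields).
The declarations below show that STA mode has four main pieces of MLME logic, each implemented as a work item: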

struct work_struct ws_connect, ws_disconnect;
struct work_struct ws_scan, ws_scan_timeout;

Once everything is implemented, the addresses of the handler functions are filled into the cfg80211_ops structure:

static struct cfg80211_ops owl_cfg_ops = {
    .change_virtual_intf = owl_change_iface,
    .scan = owl_scan,
    .connect = owl_connect,
    .disconnect = owl_disconnect,
    .get_station = owl_get_station,
    .start_ap = owl_start_ap,
    .stop_ap = owl_stop_ap,
};

MLME function implementations

change_virtual_intf

Changes the type of a virtual interface.
Modeled on brcmf_cfg80211_change_iface.
Here only the iftype of the virtual interface is changed.

switch (type) {
case NL80211_IFTYPE_STATION:
case NL80211_IFTYPE_AP:
    ndev->ieee80211_ptr->iftype = type;
    break;
default:
    pr_info("owl: invalid interface type %u\n", type);
    return -EINVAL;
}
scan

When a user requests a scan, the kernel calls owl_scan.
It first obtains the virtual interface (owl_vif *vif) through wdev_get_owl_vif, takes vif->lock, checks the request and the interface type, and then stores the request in the interface.

vif->scan_request = request 

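Finally, after releasing the lock, it calls schedule_work to queue the scan work item on the kernel's global workqueue (this is a workqueue, not the CFS scheduler).
The function that actually performs the scan is owl_scan_routine.
vwifi does not really scan; it only arms the timeout timer.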

mod_timer(&vif->scan_timeout, jiffies + msecs_to_jiffies(SCAN_TIMEOUT_MS));
timeout

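If the scan times out, owl_scan_timeout is called, which in turn schedules the work item owl_scan_timeout_work.

cfg80211_scan_done is called in the timeout path, so is the timeout always triggered?

The scan is carried out once the timeout countdown has finished.

inform_bss is used to report the scan results to the kernel,
mainly through cfg80211_inform_bss_data.
One of its parameters is a struct cfg80211_inform_bss, which describes the channel, bandwidth, and signal strength.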

struct cfg80211_inform_bss data = {
    /* the only channel */
    .chan = &ap->wdev.wiphy->bands[NL80211_BAND_2GHZ]->channels[0],
    .scan_width = NL80211_BSS_CHAN_WIDTH_20,
    .signal = DBM_TO_MBM(rand_int_smooth(-100, -30, jiffies)),
};

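CFG80211_BSS_FTYPE_UNKNOWN: the driver doesn't know whether the data is from a beacon or a probe response.

Information element (IE): a field placed in the frame; here it describes the AP found by the scan.
The first byte gives the IE type, and the second byte gives the length of the data that follows.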

u8 *ie = kmalloc(ap->ssid_len + 2, GFP_KERNEL);
ie[0] = WLAN_EID_SSID;
ie[1] = ap->ssid_len;
memcpy(ie + 2, ap->ssid, ap->ssid_len);
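
The type-length-value layout described above can be sketched in user space as follows. This is only an illustration, not vwifi code: the names demo_build_ssid_ie, DEMO_EID_SSID, and DEMO_MAX_SSID_LEN are made up for this example, and the bounds checks are an addition that the excerpt above does not show.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_EID_SSID 0      /* element ID of the SSID element in IEEE 802.11 */
#define DEMO_MAX_SSID_LEN 32 /* maximum SSID length in IEEE 802.11 */

/* Encode an SSID as a type-length-value information element.
 * Returns the number of bytes written, or 0 if the input does not fit. */
size_t demo_build_ssid_ie(uint8_t *out, size_t out_len,
                          const uint8_t *ssid, size_t ssid_len)
{
    if (ssid_len > DEMO_MAX_SSID_LEN || out_len < ssid_len + 2)
        return 0;
    out[0] = DEMO_EID_SSID;      /* byte 0: element type */
    out[1] = (uint8_t) ssid_len; /* byte 1: length of the payload */
    memcpy(out + 2, ssid, ssid_len);
    return ssid_len + 2;
}
```

For the SSID "TestAP" (6 bytes) the encoded element is 8 bytes long: one type byte, one length byte, and the six SSID bytes.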

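cfg80211_put_bss decrements the reference count of the cfg80211_bss structure returned by cfg80211_inform_bss_data, which determines whether it can be freed.
Back in owl_scan_timeout_work, cfg80211_scan_done is finally called to tell the kernel that the scan has finished.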

if (mutex_lock_interruptible(&vif->lock))
    return;

/* finish scan */
cfg80211_scan_done(vif->scan_request, &info);

vif->scan_request = NULL;

mutex_unlock(&vif->lock);
connect

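It starts when the kernel calls owl_connect, which likewise first checks whether the STA is already connected and whether the current interface type is STA.
It then sets req_ssid to the SSID to connect to: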

vif->sme_state = SME_CONNECTING;
vif->ssid_len = sme->ssid_len;
memcpy(vif->req_ssid, sme->ssid, sme->ssid_len);

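It seems to me that req_ssid should be reset once the connection completes: by then the interface is already connected, while req_ssid describes a connection that is still being requested.

Then owl_connect_routine is scheduled.

owl_connect_routine
vif is the virtual interface of the requesting side (the STA itself).
ap is the virtual interface of the side being connected to.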

struct owl_vif *vif = container_of(w, struct owl_vif, ws_connect);
struct owl_vif *ap = NULL;

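Next, list_for_each_entry walks the whole ap_list; if ap->ssid matches vif->req_ssid, the connection proceeds.
The connection logic in vwifi only calls cfg80211_connect_result to report that the connection succeeded.
It then updates the fields of vif: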

memcpy(vif->ssid, ap->ssid, ap->ssid_len);
memcpy(vif->bssid, ap->bssid, ETH_ALEN);
vif->sme_state = SME_CONNECTED;
vif->conn_time = jiffies;
vif->ap = ap;

Next, vif is added to the tail of bss_list (the AP is the list head).

list_add_tail(&vif->bss_list, &ap->bss_list);
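
The SSID lookup performed while walking ap_list can be sketched in user space with a plain array instead of a kernel list. The type demo_ap and the function demo_find_ap are hypothetical names for this illustration, and comparing the lengths before the bytes is a choice made here, not something confirmed by the excerpts above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an AP entry. */
struct demo_ap {
    unsigned char ssid[32];
    size_t ssid_len;
};

/* Return the index of the first AP whose SSID equals the requested one,
 * or -1 if no AP matches. */
int demo_find_ap(const struct demo_ap *aps, size_t n_aps,
                 const unsigned char *req_ssid, size_t req_len)
{
    for (size_t i = 0; i < n_aps; i++) {
        /* Compare lengths first so that "Test" does not match "TestAP". */
        if (aps[i].ssid_len == req_len &&
            memcmp(aps[i].ssid, req_ssid, req_len) == 0)
            return (int) i;
    }
    return -1;
}
```

SSIDs are byte strings without a terminating NUL, which is why the comparison uses an explicit length and memcmp rather than strcmp.
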
disconnect

The kernel calls owl_disconnect to disconnect. The preliminary checks are the same as in connect, except that disconnect also records a reason code:

vif->disconnect_reason_code = reason_code;

Next, look at owl_disconnect_routine.
After cfg80211_disconnected is called, the driver is expected to enter an idle state and not try to connect to any other AP:

After it calls this function, the driver should enter an idle state and not try to connect to any AP any more.

cfg80211_disconnected(vif->ndev, vif->disconnect_reason_code, NULL, 0, true,
                      GFP_KERNEL);

Then a few fields are updated:

vif->disconnect_reason_code = 0;
vif->sme_state = SME_DISCONNECTED;

This vif (node) is removed from bss_list, meaning that it is no longer connected to this AP (the head of bss_list).

list_del(&vif->bss_list);

Finally, the ap pointer of this vif is cleared:

vif->ap = NULL;

That covers the MLME implementation for STA mode.
Next comes the MLME implementation for AP mode.

get_station

BIT_ULL, defined in include/linux/bitops.h, yields an unsigned long long with only bit nr set, that is, 1ULL shifted left by nr bits.

#define BIT_ULL(nr)		(1ULL << (nr))
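
A small user-space sketch of how such a bitmap is built and queried is given below. The macro is redefined under the name DEMO_BIT_ULL so the example stands alone, and the enum values are hypothetical; they are not the real NL80211_STA_INFO_* numbers.

```c
#include <stdint.h>

#define DEMO_BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical attribute numbers, for illustration only. */
enum demo_sta_info {
    DEMO_INFO_RX_BYTES = 2,
    DEMO_INFO_SIGNAL = 7,
    DEMO_INFO_HIGH = 40, /* needs a 64-bit mask: 1 << 40 overflows a 32-bit int */
};

/* Return 1 if the bit for attr is set in the filled bitmap, else 0. */
int demo_is_filled(uint64_t filled, int attr)
{
    return (filled & DEMO_BIT_ULL(attr)) != 0;
}
```

The 1ULL constant matters for attribute numbers of 32 and above, where shifting a plain int would be undefined behavior.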

get_station mainly fills information into station_info:

struct station_info *sinfo

For every field that is filled in, the corresponding bit in sinfo->filled is set to 1. The fields vwifi fills in are:

sinfo->filled = BIT_ULL(NL80211_STA_INFO_TX_PACKETS) |
                BIT_ULL(NL80211_STA_INFO_RX_PACKETS) |
                BIT_ULL(NL80211_STA_INFO_TX_FAILED) |
                BIT_ULL(NL80211_STA_INFO_TX_BYTES) |
                BIT_ULL(NL80211_STA_INFO_RX_BYTES) |
                BIT_ULL(NL80211_STA_INFO_SIGNAL) |
                BIT_ULL(NL80211_STA_INFO_INACTIVE_TIME);

Finally, CONNECTED_TIME is set depending on whether the interface is currently connected:

if (vif->sme_state == SME_CONNECTED) {
    sinfo->filled |= BIT_ULL(NL80211_STA_INFO_CONNECTED_TIME);
    sinfo->connected_time =
        jiffies_to_msecs(jiffies - vif->conn_time) / 1000;
}
start_ap

When AP mode needs to start, the kernel calls owl_start_ap.
It first prints the parameters in struct cfg80211_ap_settings *settings.

pr_info("owl: %s start acting in AP mode.\n", ndev->name);
pr_info("ctrlchn=%d, center=%d, bw=%d, beacon_interval=%d, dtim_period=%d,",
        settings->chandef.chan->hw_value, settings->chandef.center_freq1,
        settings->chandef.width, settings->beacon_interval,
        settings->dtim_period);
pr_info("ssid=%s(%zu), auth_type=%d, inactivity_timeout=%d", settings->ssid,
        settings->ssid_len, settings->auth_type,
        settings->inactivity_timeout);

For now, start_ap in vwifi simply sets the SSID and BSSID of the AP.

vif->ssid_len = settings->ssid_len;
memcpy(vif->ssid, settings->ssid, settings->ssid_len);
memcpy(vif->bssid, vif->ndev->dev_addr, ETH_ALEN);

Then, by convention, this AP becomes the head of bss_list:

/* AP is the head of vif->bss_list */
INIT_LIST_HEAD(&vif->bss_list);

Finally, the AP is added to ap_list:

/* Add AP to global ap_list */
list_add_tail(&vif->ap_list, &owl->ap_list);
stop_ap

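stop_ap mirrors start_ap.
Because an AP is being shut down, every STA connected to it must be removed from bss_list.

No other disconnect handling is done here (such as notifying the connected STAs or changing their sme_state). Is that because it is not needed?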

list_for_each_entry_safe (pos, safe, &vif->bss_list, bss_list)
    list_del(&pos->bss_list);

Finally, the AP is removed from ap_list:

list_del(&vif->ap_list);
change_virtual_intf

I'm not sure how the mode of an owl_vif gets changed, since every interface is initialized in STA mode.

It is changed through hostapd.

Change the interface type:

static int owl_change_iface(struct wiphy *wiphy,
                            struct net_device *ndev,
                            enum nl80211_iftype type,
                            struct vif_params *params)
{
    switch (type) {
    case NL80211_IFTYPE_STATION:
    case NL80211_IFTYPE_AP:
        ndev->ieee80211_ptr->iftype = type;
        break;
    default:
        pr_info("owl: invalid interface type %u\n", type);
        return -EINVAL;
    }

    return 0;
}

Initialization

vwifi_init

First allocate the driver's single context, then initialize its fields.

owl = kmalloc(sizeof(struct owl_context), GFP_KERNEL);
...
mutex_init(&owl->lock);
INIT_LIST_HEAD(&owl->vif_list);
INIT_LIST_HEAD(&owl->ap_list);

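Next, virtual network devices are created according to the number of stations.
First, a wiphy, the structure that describes a device in nl80211, must be created; this is done by owl_cfg80211_add.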

owl_cfg80211_add

The comment here says "which will call cfg80211_ops->add_iface()", but when I looked through cfg80211_ops I couldn't find such a member.

jserv: That comment is outdated. Please submit a pull request to fix it.

A wiphy can be created with wiphy_new_nm or wiphy_new; the difference is whether you supply the name yourself.

/* NULL means use the default phy%d naming. */
wiphy = wiphy_new_nm(&owl_cfg_ops, 0, NULL);

Set the interface types this device supports (AP and STA):

wiphy->interface_modes =
    BIT(NL80211_IFTYPE_STATION) | BIT(NL80211_IFTYPE_AP);

Then fill in the band the device supports and the maximum number of SSIDs it can scan for:

wiphy->bands[NL80211_BAND_2GHZ] = &nf_band_2ghz;
wiphy->max_scan_ssids = MAX_PROBED_SSIDS;

Set the signal strength type:

wiphy->signal_type = CFG80211_SIGNAL_TYPE_MBM;

It is defined in include/net/cfg80211.h:

/**
 * enum cfg80211_signal_type - signal type
 *
 * @CFG80211_SIGNAL_TYPE_NONE: no signal strength information available
 * @CFG80211_SIGNAL_TYPE_MBM: signal strength in mBm (100*dBm)
 * @CFG80211_SIGNAL_TYPE_UNSPEC: signal strength, increasing from 0 through 100
 */
enum cfg80211_signal_type {
	CFG80211_SIGNAL_TYPE_NONE,
	CFG80211_SIGNAL_TYPE_MBM,
	CFG80211_SIGNAL_TYPE_UNSPEC,
};
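
Since CFG80211_SIGNAL_TYPE_MBM expresses signal strength in units of 100*dBm, the conversion can be sketched as below. The names DEMO_DBM_TO_MBM and demo_mbm_to_dbm are chosen for this illustration only; the sketch merely restates the arithmetic from the comment above.

```c
/* 1 dBm corresponds to 100 mBm. */
#define DEMO_DBM_TO_MBM(dbm) ((dbm) * 100)

/* Convert mBm back to whole dBm; C integer division truncates toward zero. */
int demo_mbm_to_dbm(int mbm)
{
    return mbm / 100;
}
```

With this scaling, a signal of -60 dBm is stored as -6000 mBm.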

Finally, register this wiphy with cfg80211:

if (wiphy_register(wiphy) < 0) {
    pr_info("couldn't register wiphy device\n");
    goto l_error_wiphy_register;
}
owl_interface_add

Next, a virtual interface is created for the device.

First, allocate a network device:

ndev = alloc_netdev(sizeof(struct owl_vif), NDEV_NAME, NET_NAME_ENUM,
                    ether_setup);

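The first argument, sizeof(struct owl_vif), is the size of the private data area, which here holds the device's virtual interface (owl_vif).
ether_setup is the basic initialization function for an Ethernet-like network device: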

void ether_setup(struct net_device *dev)
{
	dev->header_ops		= &eth_header_ops;
	dev->type		= ARPHRD_ETHER;
	dev->hard_header_len 	= ETH_HLEN;
	dev->min_header_len	= ETH_HLEN;
	dev->mtu		= ETH_DATA_LEN;
	dev->min_mtu		= ETH_MIN_MTU;
	dev->max_mtu		= ETH_DATA_LEN;
	dev->addr_len		= ETH_ALEN;
	dev->tx_queue_len	= DEFAULT_TX_QUEUE_LEN;
	dev->flags		= IFF_BROADCAST|IFF_MULTICAST;
	dev->priv_flags		|= IFF_TX_SKB_SHARING;

	eth_broadcast_addr(dev->broadcast);

}

Next, the virtual interface is filled in.
The wireless_dev structure describes the wireless part of the device, and the net_device's ieee80211_ptr must point to it:

the network interface’s ieee80211_ptr pointer to a struct wireless_dev which further describes the wireless part of the interface, normally this struct is embedded in the network interface’s private data area.

/* fill private data of network context. */
vif = ndev_get_owl_vif(ndev);
vif->ndev = ndev;

/* fill wireless_dev context.
 * wireless_dev with net_device can be represented as inherited class of
 * single net_device.
 */
vif->wdev.wiphy = wiphy;
vif->wdev.netdev = ndev;
vif->wdev.iftype = NL80211_IFTYPE_STATION;
vif->ndev->ieee80211_ptr = &vif->wdev;

/* set network device hooks. should implement ndo_start_xmit() at least */
vif->ndev->netdev_ops = &owl_ndev_ops;

/* Add here proper net_device initialization */
vif->ndev->features |= NETIF_F_HW_CSUM;

The MAC address setup is worth a note: the interface name is copied directly into the address. Multicast MAC addresses must be avoided, that is, the least significant bit of the first octet must not be 1, so the first byte is simply set to 0.

I think the size argument of snprintf could be ETH_ALEN - 1: writing starts at intf_name + 1, so only ETH_ALEN - 1 bytes remain, and the buffer is already zero-initialized.

jserv: Please submit a pull request.

/* The first byte is '\0' to avoid being a multicast
 * address (the first byte of multicast addrs is odd).
 */
char intf_name[ETH_ALEN] = {0};
snprintf(intf_name + 1, ETH_ALEN, "%s%d", NAME_PREFIX, if_idx);
memcpy(vif->ndev->dev_addr, intf_name, ETH_ALEN);
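
The address construction can be tried out in user space with the sketch below. It assumes the ETH_ALEN - 1 size discussed above rather than the ETH_ALEN shown in the excerpt, and demo_make_mac and DEMO_ETH_ALEN are hypothetical names for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Fill mac[] with 0x00 followed by the ASCII bytes of "<prefix><idx>". */
void demo_make_mac(unsigned char mac[DEMO_ETH_ALEN], const char *prefix, int idx)
{
    char name[DEMO_ETH_ALEN] = {0};

    /* Byte 0 stays 0x00 so that bit 0 of the first octet (the multicast
     * bit) is clear. Only DEMO_ETH_ALEN - 1 bytes, including the NUL that
     * snprintf appends, fit after it. */
    snprintf(name + 1, DEMO_ETH_ALEN - 1, "%s%d", prefix, idx);
    memcpy(mac, name, DEMO_ETH_ALEN);
}
```

For the prefix "owl" and index 0 this yields 00:6f:77:6c:30:00, since 'o', 'w', 'l', and '0' are 0x6f, 0x77, 0x6c, and 0x30 in ASCII.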

Finally, register the net_device:

if (register_netdev(vif->ndev))
    goto l_error_ndev_register;

The steps above are modeled on brcmf_cfg80211_request_sta_if.

The initialization of vif then continues:

/* Initialize connection information */
memset(vif->bssid, 0, ETH_ALEN);
memset(vif->ssid, 0, IEEE80211_MAX_SSID_LEN);
memset(vif->req_ssid, 0, IEEE80211_MAX_SSID_LEN);
vif->scan_request = NULL;
vif->sme_state = SME_DISCONNECTED;
vif->conn_time = 0;
vif->active_time = 0;
vif->disconnect_reason_code = 0;
vif->ap = NULL;

mutex_init(&vif->lock);

Initialize the scan_timeout timer:

/* Initialize timer of scan_timeout */
timer_setup(&vif->scan_timeout, owl_scan_timeout, 0);

Initialize all of the work items:

INIT_WORK(&vif->ws_connect, owl_connect_routine);
INIT_WORK(&vif->ws_disconnect, owl_disconnect_routine);
INIT_WORK(&vif->ws_scan, owl_scan_routine);
INIT_WORK(&vif->ws_scan_timeout, owl_scan_timeout_work);

Finally, add the virtual interface to owl's vif_list:

/* Add vif into global vif_list */
if (mutex_lock_interruptible(&owl->lock))
    goto l_error_add_list;
list_add_tail(&vif->list, &owl->vif_list);
mutex_unlock(&owl->lock);

net_device_ops

This structure defines the management hooks for network devices.
The following hooks can be defined; unless noted otherwise, they are optional and can be filled with a null pointer.

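net_device_ops is the set of management callbacks of a network device; unless a hook is documented as required, it is optional.

Network Drivers covers the basic concepts.

For the implementation, see Linux Kernel(16.1)- Network Device Driver, simple snull.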

ndo_open

This function is called when a network device transitions to the up state.

owl_ndo_open is defined as follows:

static int owl_ndo_open(struct net_device *dev)
{
    netif_start_queue(dev);
    return 0;
}

netif_start_queue marks the device as allowed to transmit:

/**
 *	netif_start_queue - allow transmit
 *	@dev: network device
 *
 *	Allow upper layers to call the device hard_start_xmit routine.
 */
static inline void netif_start_queue(struct net_device *dev)
{
	netif_tx_start_queue(netdev_get_tx_queue(dev, 0));
}
owl_ndo_stop

This function is called when a network device transitions to the down state.

Little happens here: since the device stops transmitting, the packets it has already received are discarded as well.

list_for_each_entry_safe (pkt, is, &vif->rx_queue, list) {
    list_del(&pkt->list);
    kfree(pkt);
}

Finally, the transmit queue is stopped:

netif_stop_queue - stop transmitted packets
ndo_get_stats

Whenever an application needs to get statistics for the interface, this method is called. This happens, for example, when ifconfig or netstat -i is run.

It returns the device statistics, and is typically invoked through ifconfig or netstat -i.

static struct net_device_stats *owl_ndo_get_stats(struct net_device *dev)
{
    struct owl_vif *vif = ndev_get_owl_vif(dev);
    return &vif->stats;
}
ndo_start_xmit

Method that initiates the transmission of a packet. The full packet (protocol headers and all) is contained in a socket buffer (sk_buff) structure.

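Starts transmitting a packet; the whole packet is represented by an sk_buff structure.

The parameters sk_buff *skb and net_device *dev are the packet to send and the device that sends it.

First obtain the sending virtual interface, a pointer for the receiving one, and the layer-2 header: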

struct owl_vif *vif = ndev_get_owl_vif(dev);
struct owl_vif *dest_vif = NULL;
struct ethhdr *eth_hdr = (struct ethhdr *) skb->data;

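Next, the function checks whether the interface is a STA or an AP; each has its own path.
The STA logic is simple: the only possible peer is the AP it is currently connected to (multicast and broadcast frames also go to the AP first and are relayed from there), so the packet is sent straight to it.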

if (vif->wdev.iftype == NL80211_IFTYPE_STATION) {
    if (vif->ap && vif->ap->ap_enabled) {
        dest_vif = vif->ap;

        if (__owl_ndo_start_xmit(vif, dest_vif, skb))
            count++;
    }
}

broadcast:

if (is_broadcast_ether_addr(eth_hdr->h_dest)) {
    list_for_each_entry (dest_vif, &vif->bss_list, bss_list) {
        /* Don't send broadcast packet back
         * to the source interface.
         */
        if (ether_addr_equal(eth_hdr->h_source,
                             dest_vif->ndev->dev_addr))
            continue;

        if (__owl_ndo_start_xmit(vif, dest_vif, skb))
            count++;
    }
}

unicast:

/* The packet is unicasting */
else {
    list_for_each_entry (dest_vif, &vif->bss_list, bss_list) {
        if (ether_addr_equal(eth_hdr->h_dest,
                             dest_vif->ndev->dev_addr)) {
            if (__owl_ndo_start_xmit(vif, dest_vif, skb))
                count++;
            break;
        }
    }
}

Finally, the packet is counted as dropped if it was delivered to nobody, and the skb is freed.

if (!count)
    vif->stats.tx_dropped++;

/* Don't forget to cleanup skb, as its ownership moved to xmit callback. */
dev_kfree_skb(skb);
__owl_ndo_start_xmit

This function copies the data out of the packet.
It first allocates space for an owl_packet:

pkt = kmalloc(sizeof(struct owl_packet), GFP_KERNEL);

Then it copies the data from the sk_buff into the owl_packet:

datalen = skb->len;
memcpy(pkt->data, skb->data, datalen);
pkt->datalen = datalen;

Next, the packet is put on the rx_queue of the destination virtual interface:

/* enqueue packet to destination vif's rx_queue */
if (mutex_lock_interruptible(&dest_vif->lock))
    goto l_error_before_rx_queue;

list_add_tail(&pkt->list, &dest_vif->rx_queue);

mutex_unlock(&dest_vif->lock);

Then the statistics of the source virtual interface are updated:

if (mutex_lock_interruptible(&vif->lock))
    goto l_erorr_after_rx_queue;

/* Update interface statistics */
vif->stats.tx_packets++;
vif->stats.tx_bytes += datalen;
vif->active_time = jiffies;

mutex_unlock(&vif->lock);

Finally, the receive side has to be handled:

/* Directly send to rx_queue, simulate the rx interrupt */
owl_rx(dest_vif->ndev);
owl_rx

Processes the packets received on rx_queue and converts them to sk_buff form.
It starts with the first packet in the queue:

if (mutex_lock_interruptible(&vif->lock))
    goto pkt_free;

pkt = list_first_entry(&vif->rx_queue, struct owl_packet, list);

vif->stats.rx_packets++;
vif->stats.rx_bytes += pkt->datalen;
vif->active_time = jiffies;

mutex_unlock(&vif->lock);

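The article On the alignment of IP packets explains the next step: because the Ethernet header is 14 bytes long, the IP header that follows it would not start on a four-byte boundary, so 2 extra bytes are reserved in front (14 + 2 = 16).

skb_reserve moves the data and tail pointers forward to create headroom.
skb_put advances the tail pointer to extend the data area and returns the tail position from before the move.
Finally, the packet is removed from rx_queue and freed.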

/* Put raw packet into socket buffer */
skb = dev_alloc_skb(pkt->datalen + 2);
if (!skb) {
    pr_info("owl rx: low on mem - packet dropped\n");
    vif->stats.rx_dropped++;
    goto pkt_free;
}
skb_reserve(skb, 2); /* align IP on 16B boundary */
memcpy(skb_put(skb, pkt->datalen), pkt->data, pkt->datalen);

list_del(&pkt->list);
kfree(pkt);
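
The alignment arithmetic behind skb_reserve(skb, 2) can be checked with a small user-space sketch. It assumes that the start of the buffer is itself suitably aligned, and the names DEMO_ETH_HLEN, demo_ip_offset, and demo_ip_aligned4 are hypothetical.

```c
#define DEMO_ETH_HLEN 14 /* length of an Ethernet header in bytes */

/* Offset of the IP header from the start of the buffer when `reserve`
 * bytes of headroom are placed in front of the Ethernet header. */
unsigned demo_ip_offset(unsigned reserve)
{
    return reserve + DEMO_ETH_HLEN;
}

/* Return 1 if the IP header would start on a 4-byte boundary. */
int demo_ip_aligned4(unsigned reserve)
{
    return demo_ip_offset(reserve) % 4 == 0;
}
```

Without headroom the IP header sits at offset 14, which is not a multiple of 4; with 2 bytes reserved it sits at offset 16.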

When an AP handles a multicast/broadcast packet, it relays the packet to every STA except the source STA.
Because broadcast is a special case of multicast, is_multicast_ether_addr covers both; if it matches, skb is copied into skb1.

/* Receiving a multicast/broadcast packet, send it to every
 * STA except the source STA, and pass it to protocol stack.
 */
if (is_multicast_ether_addr(eth_hdr->h_dest)) {
    pr_info("owl: is_multicast_ether_addr\n");
    skb1 = skb_copy(skb, GFP_KERNEL);
}
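
Why a single multicast test also catches broadcast frames can be seen from a user-space sketch of the two checks. These are simplified re-implementations written for this note under hypothetical names, not the kernel helpers themselves.

```c
#include <stdint.h>

/* An address is multicast when bit 0 of its first octet is set. */
int demo_is_multicast_addr(const uint8_t addr[6])
{
    return addr[0] & 0x01;
}

/* The broadcast address is ff:ff:ff:ff:ff:ff. */
int demo_is_broadcast_addr(const uint8_t addr[6])
{
    for (int i = 0; i < 6; i++) {
        if (addr[i] != 0xff)
            return 0;
    }
    return 1;
}
```

The first octet of the broadcast address is 0xff, whose lowest bit is set, so every broadcast address also passes the multicast check.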

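If the AP receives a unicast packet, there are two possibilities: it is addressed to the AP itself, or to some STA in the BSS. The second case is handled first: skb is handed over to skb1 (no copy is made) and skb is set to NULL.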

/* Receiving a unicast packet */
else {
    /* The packet is not for AP itself, send it to destination
     * STA, and do not pass it to procotol stack.
     */
    if (!ether_addr_equal(eth_hdr->h_dest, vif->ndev->dev_addr)) {
        skb1 = skb;
        skb = NULL;
    }
}

Then the relaying is done first (sending to other STAs):

if (skb1) {
    pr_info("owl: AP %s relay:\n", vif->ndev->name);
    owl_ndo_start_xmit(skb1, vif->ndev);
}

/* Nothing to pass to protocol stack */
if (!skb)
    return;

In the first case, the packet is for this interface itself (STA or AP), so it is passed to the protocol stack:

    /* Pass the skb to protocol stack */
    skb->dev = dev;
    skb->protocol = eth_type_trans(skb, dev);
    skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't check it */
#if LINUX_VERSION_CODE < KERNEL_VERSION(5, 18, 0)
    netif_rx_ni(skb);
#else
    netif_rx(skb);
#endif

    return;

signal

TODO

scripts/common.sh

check_kmod

Checks whether the given module is currently loaded; returns 0 if it is and 1 if it is not.

function check_kmod() {
    local mod_name=$1
    lsmod | grep $mod_name > /dev/null
    if [ $? -eq 0 ]; then
        # Module found
        return 0
    fi
    return 1
}

insert_kmod

Loads the given module.
First strip the .ko suffix from xxx.ko and store xxx in noko_name:

local mod_name=$1
local param=$2
local noko_name=$(echo $mod_name |sed s/.ko//)

Check whether the module is already loaded; if so, remove it first:

check_kmod $noko_name
ret=$?
if [ $ret -eq 0 ] ; then
    sudo rmmod $noko_name > /dev/null
fi

Finally, load the module and return the result of check_kmod:

echo "Installing Module $mod_name"
sudo insmod $mod_name $param
return $(check_kmod $noko_name)

probe_kmod

Also loads a module, but unlike insert_kmod it resolves module dependencies and loads whatever modules are required.

remove_kmod

Removes a module.
check_kmod is used to see whether it is loaded; if not, the function returns 0 right away:

check_kmod $mod_name
ret=$?
if [ $ret -eq 1 ] ; then
    return 0
fi

Then the module is removed:

echo "Removing Module $mod_name"
sudo rmmod $mod_name > /dev/null
return 0

start_hostapd

I couldn't find where start_hostapd is called in vwifi.

It is run by hostapd.

It mainly locates hostapd and then runs it:

which hostapd
ret=$?
if [ $ret -eq 1 ] ; then
    echo "Hostapd is not found"
    return 3
fi
sudo hostapd -B $1 > /dev/null

stop_hostapd

Stops the hostapd process:

echo "Stop Hostapd"
sudo kill -9 $(pidof hostapd) > /dev/null

scripts/verify.sh

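ROOT is set to the absolute path of the parent of the directory containing this script (cd there, then print the path with pwd).
Then the functions in common.sh are sourced: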

export ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
source $ROOT/scripts/common.sh

final_ret holds the final result: 0 means success, and any other number identifies the step that failed, which makes debugging easier.

final_ret=0

First load the cfg80211 module:

probe_kmod cfg80211
if [ $? -ne 0 ]; then
    final_ret=1
fi

Load the vwifi module:

insert_kmod vwifi.ko station=3
if [ $? -ne 0 ]; then
    final_ret=2
fi

Check whether hostapd is installed:

which hostapd > /dev/null
if [ $? -ne 0 ]; then
    final_ret=3
fi

If everything so far succeeded, the tests can begin.
Description of iw dev:

List all network interfaces for wireless hardware.

sudo iw dev

It prints:

phy#6
	Interface owl2
		ifindex 8
		wdev 0x600000001
		addr 00:6f:77:6c:32:00
		type managed
phy#5
	Interface owl1
		ifindex 7
		wdev 0x500000001
		addr 00:6f:77:6c:31:00
		type managed
phy#4
	Interface owl0
		ifindex 6
		wdev 0x400000001
		addr 00:6f:77:6c:30:00
		type managed
phy#0
	Interface wlp0s20f3
		ifindex 2
		wdev 0x1
		addr 58:96:1d:48:d9:56
		ssid hpds
		type managed
		channel 6 (2437 MHz), width: 20 MHz, center1: 2437 MHz
		txpower 22.00 dBm
		multicast TXQ:
			qsz-byt	qsz-pkt	flows	drops	marks	overlmt	hashcol	tx-bytes	tx-packets
			0	0	0	0	0	0	0	0		0

Next, extract the phy name of each device and drop the #.
With the output above, owl0_phy is phy#4,
and after ${owl0_phy/\#/} it becomes phy4.

owl0_phy=$(cat device.log | grep -B 1 owl0 | grep phy)
owl0_phy=${owl0_phy/\#/}
owl1_phy=$(cat device.log | grep -B 1 owl1 | grep phy)
owl1_phy=${owl1_phy/\#/}
owl2_phy=$(cat device.log | grep -B 1 owl2 | grep phy)
owl2_phy=${owl2_phy/\#/}

Create network namespaces named ns0, ns1, and ns2:

# create network namespaces for each phy (interface) 
sudo ip netns add ns0
sudo ip netns add ns1
sudo ip netns add ns2

Then move the three virtual devices into their respective namespaces:

# add each phy (interface) to separate network namespaces
sudo iw phy $owl0_phy set netns name ns0
sudo iw phy $owl1_phy set netns name ns1
sudo iw phy $owl2_phy set netns name ns2

Bring up the lo and owl devices in each namespace, and run hostapd on owl0 to make it the AP:

# running hostapd on owl0, so owl0 becomes AP
sudo ip netns exec ns0 ip link set owl0 up
sudo ip netns exec ns0 ip link set lo up
sudo ip netns exec ns0 hostapd -B scripts/hostapd.conf

Running this produces:

Configuration file: scripts/hostapd.conf
Failed to create interface mon.owl0: -95 (Operation not supported)
owl0: Could not connect to kernel driver
Using interface owl0 with hwaddr 00:6f:77:6c:30:00 and ssid "TestAP"
owl0: interface state UNINITIALIZED->ENABLED
owl0: AP-ENABLED 

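I'm not sure whether this is normal.
TODO:
I don't fully understand how hostapd works; its flow needs a closer look later.

Assign IP addresses to the virtual network devices (owl0, owl1, owl2):

# assign IP address to each interface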
sudo ip netns exec ns0 ip addr add 10.0.0.1/24 dev owl0
sudo ip netns exec ns1 ip addr add 10.0.0.2/24 dev owl1
sudo ip netns exec ns2 ip addr add 10.0.0.3/24 dev owl2

With the setup done, testing starts.
The first test is owl1 pinging owl2; since neither is connected yet, it should fail:

PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
From 10.0.0.2 icmp_seq=1 Destination Host Unreachable

--- 10.0.0.3 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

Next, run a scan (sudo ip netns exec ns1 iw dev owl1 scan).
iw dev <devname> scan:

Scan on the given frequencies and probe for the given SSIDs(or wildcard if not given) unless passive scanning is requested.

This gives the following output (the scan result):

BSS 00:6f:77:6c:30:00(on owl1)
	TSF: 303690291 usec (0d, 00:05:03)
	freq: 2412
	beacon interval: 100 TUs
	capability: ESS (0x0001)
	signal: -60.00 dBm
	last seen: 0 ms ago
	SSID: TestAP

After connecting, run the link command.
iw dev <devname> link:

Print information about the current link, if any.

This gives:

Connected to 00:6f:77:6c:30:00 (on owl1)
	SSID: TestAP
	freq: 2412
	RX: 0 bytes (0 packets)
	TX: 0 bytes (0 packets)
	signal: -32 dBm

The MAC addresses from the two commands are then compared to make sure the STA is connected to the same AP that the scan found:

DIFF=$(diff connected.log scan_bssid.log)
if [ "$DIFF" != "" ]; then
    final_ret=4
fi

owl2 goes through the same steps (scan, connect, link).
Once both are connected, connectivity between the STAs is tested, that is, owl1 pings owl2, with the following result:

PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
64 bytes from 10.0.0.3: icmp_seq=1 ttl=64 time=0.168 ms
64 bytes from 10.0.0.3: icmp_seq=2 ttl=64 time=0.136 ms
64 bytes from 10.0.0.3: icmp_seq=3 ttl=64 time=0.144 ms
64 bytes from 10.0.0.3: icmp_seq=4 ttl=64 time=0.126 ms

--- 10.0.0.3 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3077ms
rtt min/avg/max/mdev = 0.126/0.143/0.168/0.015 ms

While reading verify.sh, I found a mistake in lines 160 to 164:

python3 $ROOT/scripts/plot_rssi.py
plot_rc=$?
if [ $plot_rc -ne 0 ]; then
-    plot_rc=9
+    final_ret=9
fi

Fixed in commit dbfd05e

Exploring hostapd

Because the flow of hostapd was unclear to me, I traced its execution.
Below, trace-cmd traces the run of hostapd, filtered on functions whose names contain 80211:

sudo trace-cmd record -p function -l '*80211*' sudo ip netns exec ns0 hostapd -B scripts/hostapd.conf

Afterwards, trace-cmd report prints the trace.
Only the parts of interest are listed below.
First, the interface change (STA -> AP):

...
hostapd-17315 [006]  4219.815238: function:             nl80211_set_interface
hostapd-17315 [006]  4219.815239: function:                nl80211_parse_mon_options.isra.0
hostapd-17315 [006]  4219.815239: function:                cfg80211_change_iface
hostapd-17315 [006]  4219.815239: function:                   cfg80211_disconnect
hostapd-17315 [006]  4219.815239: function:                   cfg80211_process_wdev_events
hostapd-17315 [006]  4219.815240: function:                   cfg80211_mlme_purge_registrations
hostapd-17315 [006]  4219.815240: function:                      cfg80211_mgmt_registrations_update
hostapd-17315 [006]  4219.815240: function:                   cfg80211_update_iface_num
hostapd-17315 [006]  4219.815240: function:                   cfg80211_update_iface_num
hostapd-17315 [006]  4219.815240: function:                nl80211_notify_iface
hostapd-17315 [006]  4219.815241: function:                   nl80211_send_iface
...

Looking at the code of cfg80211_change_iface,
it checks whether the driver provides a change_virtual_intf callback:

if (!rdev->ops->change_virtual_intf ||
    !(rdev->wiphy.interface_modes & (1 << ntype)))
    return -EOPNOTSUPP;

It then does different preparation depending on the old interface type. Since vwifi only goes from STA to AP, only the STA case is shown here; it first performs a disconnect.

switch (otype) {
...
case NL80211_IFTYPE_STATION:
case NL80211_IFTYPE_P2P_CLIENT:
    wdev_lock(dev->ieee80211_ptr);
    cfg80211_disconnect(rdev, dev,
                WLAN_REASON_DEAUTH_LEAVING, true);
    wdev_unlock(dev->ieee80211_ptr);
    break;
...
}

Next is where the switch actually happens:

err = rdev_change_virtual_intf(rdev, dev, ntype, params);

Jumping to rdev_change_virtual_intf:

static inline int
rdev_change_virtual_intf(struct cfg80211_registered_device *rdev,
			 struct net_device *dev, enum nl80211_iftype type,
			 u32 *flags, struct vif_params *params)
{
	int ret;
	trace_rdev_change_virtual_intf(&rdev->wiphy, dev, type);
	ret = rdev->ops->change_virtual_intf(&rdev->wiphy, dev, type, flags,
					     params);
	trace_rdev_return_int(&rdev->wiphy, ret);
	return ret;
}

The call rdev->ops->change_virtual_intf runs the function registered by vwifi, which simply switches the iftype.
Back in the trace-cmd report, look for the part that starts the AP:

hostapd-17315 [006]  4219.815724: function:             nl80211_start_ap
hostapd-17315 [006]  4219.815724: function:                nl80211_parse_beacon
hostapd-17315 [006]  4219.815724: function:                cfg80211_validate_beacon_int
hostapd-17315 [006]  4219.815724: function:                nl80211_crypto_settings
hostapd-17315 [006]  4219.815725: function:                cfg80211_reg_can_beacon_relax
hostapd-17315 [006]  4219.815725: function:                   _cfg80211_reg_can_beacon.constprop.0
hostapd-17315 [006]  4219.815725: function:                      cfg80211_chandef_dfs_required
hostapd-17315 [006]  4219.815725: function:                         cfg80211_chandef_valid
hostapd-17315 [006]  4219.815725: function:                         cfg80211_get_chans_dfs_required
hostapd-17315 [006]  4219.815725: function:                            ieee80211_get_channel_khz
hostapd-17315 [006]  4219.815725: function:                      cfg80211_chandef_usable
hostapd-17315 [006]  4219.815725: function:                         cfg80211_chandef_valid
hostapd-17315 [006]  4219.815726: function:                         cfg80211_secondary_chans_ok
hostapd-17315 [006]  4219.815726: function:                            ieee80211_get_channel_khz

Now look at the code of nl80211_start_ap.
If the driver does not provide start_ap, it returns an error:

if (!rdev->ops->start_ap)
    return -EOPNOTSUPP;

Look at params:

struct cfg80211_ap_settings params;
...
if (info->attrs[NL80211_ATTR_SSID]) {
    params.ssid = nla_data(info->attrs[NL80211_ATTR_SSID]);
    params.ssid_len =
        nla_len(info->attrs[NL80211_ATTR_SSID]);
    if (params.ssid_len == 0)
        return -EINVAL;
}

Finally, this is where start_ap is actually invoked; on success the fields of wdev are updated as well:

struct wireless_dev *wdev = dev->ieee80211_ptr;
...
wdev_lock(wdev);
err = rdev_start_ap(rdev, dev, &params);
if (!err) {
    wdev->preset_chandef = params.chandef;
    wdev->beacon_interval = params.beacon_interval;
    wdev->chandef = params.chandef;
    wdev->ssid_len = params.ssid_len;
    memcpy(wdev->ssid, params.ssid, wdev->ssid_len);

    if (info->attrs[NL80211_ATTR_SOCKET_OWNER])
        wdev->conn_owner_nlportid = info->snd_portid;
}
wdev_unlock(wdev);

Looking at rdev_start_ap:

static inline int rdev_start_ap(struct cfg80211_registered_device *rdev,
				struct net_device *dev,
				struct cfg80211_ap_settings *settings)
{
	int ret;
	trace_rdev_start_ap(&rdev->wiphy, dev, settings);
	ret = rdev->ops->start_ap(&rdev->wiphy, dev, settings);
	trace_rdev_return_int(&rdev->wiphy, ret);
	return ret;
}

It simply calls rdev->ops->start_ap(&rdev->wiphy, dev, settings), which is the start_ap function implemented in vwifi.

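TODO: Prepare a network test environment with multiple SSIDs/STAs

Following Testing a Linux Routing Daemon in a Simulated Environment, prepare a test environment with multiple SSIDs/STAs and improve the existing CI flow of vwifi.

Moving the cleanup out of the if block

The original code was: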

if [ $final_ret -eq 0 ]; then
    stop_hostapd
    remove_kmod vwifi
    sudo ip netns del ns0
    sudo ip netns del ns1
    sudo ip netns del ns2
    rm scan_result.log scan_bssid.log connected.log device.log rssi.txt 
    echo "==== Test PASSED ===="
    exit 0
fi

echo "FAILED (code: $final_ret)"
echo "==== Test FAILED ===="
exit $final_ret

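However, I found that when final_ret is nonzero at the end, the commands from stop_hostapd through rm are never run.
As a result, they have to be typed by hand before make check can be run again.
So I moved these commands out of the if block: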

if [ $final_ret -eq 0 ]
then 
    echo "==== Test PASSED ===="
else
    echo "FAILED (code: $final_ret)"
    echo "==== Test FAILED ===="
fi

stop_hostapd
remove_kmod vwifi
sudo ip netns del ns0
sudo ip netns del ns1
sudo ip netns del ns2
sudo ip netns del ns3
sudo ip netns del ns4
sudo ip netns del ns5
rm scan_result.log scan_bssid.log connected.log device.log rssi.txt

exit $final_ret

This way the long list of commands no longer has to be retyped every time.

commit e2ca477

jserv: TODO: submit a pull request

Adding a test with a second AP

With two APs, a single hostapd.conf cannot be shared,
so I split it into hostapd1.conf and hostapd2.conf.
Six virtual devices are needed in total (2 APs and 4 STAs):

# get phy number of each interface
sudo iw dev > device.log
owl0_phy=$(cat device.log | grep -B 1 owl0 | grep phy)
owl0_phy=${owl0_phy/\#/}
owl1_phy=$(cat device.log | grep -B 1 owl1 | grep phy)
owl1_phy=${owl1_phy/\#/}
owl2_phy=$(cat device.log | grep -B 1 owl2 | grep phy)
owl2_phy=${owl2_phy/\#/}
+owl3_phy=$(cat device.log | grep -B 1 owl3 | grep phy)
+owl3_phy=${owl3_phy/\#/}
+owl4_phy=$(cat device.log | grep -B 1 owl4 | grep phy)
+owl4_phy=${owl4_phy/\#/}
+owl5_phy=$(cat device.log | grep -B 1 owl5 | grep phy)
+owl5_phy=${owl5_phy/\#/}

Create six network namespaces:

# create network namespaces for each phy (interface) 
sudo ip netns add ns0
sudo ip netns add ns1
sudo ip netns add ns2
+sudo ip netns add ns3
+sudo ip netns add ns4
+sudo ip netns add ns5

Move the virtual devices into their network namespaces:

# add each phy (interface) to separate network namespaces
sudo iw phy $owl0_phy set netns name ns0
sudo iw phy $owl1_phy set netns name ns1
sudo iw phy $owl2_phy set netns name ns2
+sudo iw phy $owl3_phy set netns name ns3
+sudo iw phy $owl4_phy set netns name ns4
+sudo iw phy $owl5_phy set netns name ns5

Run hostapd on owl0 and owl3 to turn them into APs,
then bring up each virtual network device.

# running hostapd on owl0 and owl3, so owl0 and owl3 becomes AP
sudo ip netns exec ns0 ip link set owl0 up
sudo ip netns exec ns0 ip link set lo up
sudo ip netns exec ns0 hostapd -B scripts/hostapd1.conf
+sudo ip netns exec ns3 ip link set owl3 up
+sudo ip netns exec ns3 ip link set lo up
+sudo ip netns exec ns3 hostapd -B scripts/hostapd2.conf

sudo ip netns exec ns1 ip link set owl1 up
sudo ip netns exec ns1 ip link set lo up

sudo ip netns exec ns2 ip link set owl2 up
sudo ip netns exec ns2 ip link set lo up

+sudo ip netns exec ns4 ip link set owl4 up
+sudo ip netns exec ns4 ip link set lo up

+sudo ip netns exec ns5 ip link set owl5 up
+sudo ip netns exec ns5 ip link set lo up

The newly added network devices are assigned the following addresses:
owl3 is assigned 10.0.0.4/24
owl4 is assigned 10.0.0.5/24
owl5 is assigned 10.0.0.6/24

# assign an IP address to each interface
sudo ip netns exec ns0 ip addr add 10.0.0.1/24 dev owl0
sudo ip netns exec ns1 ip addr add 10.0.0.2/24 dev owl1
sudo ip netns exec ns2 ip addr add 10.0.0.3/24 dev owl2
+sudo ip netns exec ns3 ip addr add 10.0.0.4/24 dev owl3
+sudo ip netns exec ns4 ip addr add 10.0.0.5/24 dev owl4
+sudo ip netns exec ns5 ip addr add 10.0.0.6/24 dev owl5

The remaining tests are the same as before, so they are not listed here.
commit e3d3dc8
commit f2162a2

station dump

When I tried to list the STAs managed by an AP with iw station dump, nothing was printed.

$ sudo ip netns exec ns0 iw dev owl0 station dump
$

So I looked into what iw station dump actually does.
Tracing it with trace-cmd gives the following for the iw process:

iw-5563  [001]  3017.903103: function:             nl80211_netlink_notify
iw-5563  [001]  3017.903111: function:             nl80211_netlink_notify
iw-5563  [001]  3017.904452: function:             nl80211_dump_station
iw-5563  [001]  3017.904453: function:                nl80211_prepare_wdev_dump
iw-5563  [001]  3017.904454: function:                   __cfg80211_wdev_from_attrs
iw-5563  [001]  3017.904476: function:             nl80211_netlink_notify
iw-5563  [001]  3017.904476: function:                cfg80211_mlme_unregister_socket
iw-5563  [001]  3017.904478: function:                cfg80211_release_pmsr
iw-5563  [001]  3017.904478: function:                cfg80211_mlme_unregister_socket
iw-5563  [001]  3017.904479: function:                cfg80211_release_pmsr
iw-5563  [001]  3017.904479: function:                cfg80211_mlme_unregister_socket
iw-5563  [001]  3017.904480: function:                cfg80211_release_pmsr
iw-5563  [001]  3017.904480: function:                cfg80211_mlme_unregister_socket
iw-5563  [001]  3017.904484: function:                cfg80211_release_pmsr

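Looking specifically at the code of nl80211_dump_station, its behavior is as follows.

It first checks whether the driver defines a dump_station callback.
It then enters an infinite loop: sta_idx starts at 0 and is incremented until the callback returns -ENOENT (every station has been enumerated) or fails (err is neither 0 nor -ENOENT).
On success, sinfo has been filled with the requested STA information, which nl80211_send_station then passes up through nl80211.
At the end of each iteration sta_idx is incremented by 1 and the loop continues.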

static int nl80211_dump_station(struct sk_buff *skb,
				struct netlink_callback *cb)
{
...
if (!dev->ops->dump_station) {
    err = -EOPNOTSUPP;
    goto out_err;
}

while (1) {
    err = dev->ops->dump_station(&dev->wiphy, netdev, sta_idx,
                     mac_addr, &sinfo);
    if (err == -ENOENT)
        break;
    if (err)
        goto out_err;

    if (nl80211_send_station(skb,
            NETLINK_CB(cb->skb).pid,
            cb->nlh->nlmsg_seq, NLM_F_MULTI,
            netdev, mac_addr,
            &sinfo) < 0)
        goto out;

    sta_idx++;
}
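
The enumeration contract described above (call the callback with idx = 0, 1, 2, ... until it answers -ENOENT) can be tried in user space. The following is only a sketch: demo_ap, demo_dump_station and demo_dump_all are hypothetical names introduced for illustration and are not part of cfg80211 or vwifi.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_ETH_ALEN 6
#define DEMO_MAX_STA 4

/* Hypothetical stand-in for the station table a driver keeps per AP. */
struct demo_ap {
    unsigned char sta_mac[DEMO_MAX_STA][DEMO_ETH_ALEN];
    int n_sta;
};

/* Driver side: report station number idx, or -ENOENT once idx is past
 * the last station.
 */
static int demo_dump_station(const struct demo_ap *ap, int idx,
                             unsigned char *mac)
{
    assert(ap != NULL && mac != NULL);
    if (idx < 0 || idx >= ap->n_sta)
        return -ENOENT;
    memcpy(mac, ap->sta_mac[idx], DEMO_ETH_ALEN);
    return 0;
}

/* Caller side: same shape as the loop in nl80211_dump_station.
 * Returns the number of stations copied into out, or a negative errno.
 */
static int demo_dump_all(const struct demo_ap *ap,
                         unsigned char out[][DEMO_ETH_ALEN], int max_out)
{
    int sta_idx = 0;

    while (sta_idx < max_out) {
        int err = demo_dump_station(ap, sta_idx, out[sta_idx]);

        if (err == -ENOENT)
            break;      /* every station has been reported */
        if (err)
            return err; /* any other error aborts the dump */
        sta_idx++;
    }
    return sta_idx;
}
```

With two stations in the table, demo_dump_all calls the callback three times (idx 0, 1 and 2) and stops on the third call, which is why the callback must return exactly -ENOENT for an out-of-range index.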

Next I looked at how mac80211 implements station dump.
The mac80211 excerpt below shows that its implementation is called ieee80211_dump_station.

const struct cfg80211_ops mac80211_config_ops = {
    ...
	.get_station = ieee80211_get_station,
	.dump_station = ieee80211_dump_station,

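Looking at mac80211's ieee80211_dump_station:
IEEE80211_DEV_TO_SUB_IF retrieves the private data of dev (that is, the interface structure).
sta_info_get_by_idx returns the STA it finds. If one is found (sta is not NULL), ret is set to 0 to indicate success and the STA's information is written into sinfo.
The value returned in ret tells nl80211_dump_station whether to keep looking for the next STA.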

static int ieee80211_dump_station(struct wiphy *wiphy, struct net_device *dev,
				  int idx, u8 *mac, struct station_info *sinfo)
{
	struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev);
	struct ieee80211_local *local = sdata->local;
	struct sta_info *sta;
	int ret = -ENOENT;

	mutex_lock(&local->sta_mtx);

	sta = sta_info_get_by_idx(sdata, idx);
	if (sta) {
		ret = 0;
		memcpy(mac, sta->sta.addr, ETH_ALEN);
		sta_set_sinfo(sta, sinfo);
	}

	mutex_unlock(&local->sta_mtx);

	return ret;
}

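Then I looked at the implementation of sta_info_get_by_idx.
The logic walks the doubly linked list with list_for_each_entry_rcu, skips entries that belong to other interfaces, and stops at the idx-th matching entry.
If the whole list is traversed without reaching idx, it returns NULL, meaning that every STA has already been reported.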

struct sta_info *sta_info_get_by_idx(struct ieee80211_sub_if_data *sdata,
				     int idx)
{
	struct ieee80211_local *local = sdata->local;
	struct sta_info *sta;
	int i = 0;

	list_for_each_entry_rcu(sta, &local->sta_list, list) {
		if (sdata != sta->sdata)
			continue;
		if (i < idx) {
			++i;
			continue;
		}
		return sta;
	}

	return NULL;
}
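
One detail of sta_info_get_by_idx worth noting is that idx counts only the stations that belong to the requesting interface. A minimal user-space sketch of that lookup follows; it uses a plain singly linked list instead of the kernel's RCU list, and demo_sta and demo_sta_get_by_idx are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical station record; owner identifies the interface it belongs to. */
struct demo_sta {
    int owner;
    int id;
    struct demo_sta *next;
};

/* Return the idx-th station owned by owner, or NULL if there is none. */
static struct demo_sta *demo_sta_get_by_idx(struct demo_sta *head, int owner,
                                            int idx)
{
    int i = 0;

    assert(idx >= 0);
    for (struct demo_sta *sta = head; sta != NULL; sta = sta->next) {
        if (sta->owner != owner)
            continue; /* belongs to another interface, does not count */
        if (i < idx) {
            ++i;
            continue; /* not yet at the requested position */
        }
        return sta;
    }
    return NULL; /* idx is past the last matching station */
}
```

For a list whose owners are 1, 2, 1, index 1 for owner 1 resolves to the third node, because the node owned by interface 2 is skipped without advancing the counter.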

Now for the changes to vwifi.
First, register the callback for station dump:

static struct cfg80211_ops owl_cfg_ops = {
    .change_virtual_intf = owl_change_iface,
    .scan = owl_scan,
    .connect = owl_connect,
    .disconnect = owl_disconnect,
    .get_station = owl_get_station,
+   .dump_station = owl_dump_station,
    .start_ap = owl_start_ap,
    .stop_ap = owl_stop_ap,
};

Add the owl_dump_station function, modeled on the mac80211 implementation.
First, use dev to obtain the AP's owl_vif:

struct owl_vif *ap_vif = ndev_get_owl_vif(dev);

Then print idx to check that the function is being called as expected:

pr_info("Dump station at the idx %d\n", idx);

Initialize the variables that are needed:

int ret = -ENOENT; /* -ENOENT is the value that ends the loop in nl80211_dump_station */
struct owl_vif *sta_vif = NULL;
int i = 0;

Then loop until the idx-th STA is reached:

list_for_each_entry (sta_vif, &ap_vif->bss_list, bss_list) {
    if (i < idx) {
        ++i;
        continue;
    }
    break;
}

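Set ret according to sta_vif. If idx is past the end of the list, the loop finishes without hitting break and sta_vif is left pointing at the container of the list head, which is ap_vif itself, so comparing against ap_vif detects the "not found" case: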

if (sta_vif != ap_vif)
    ret = 0;
else
    return ret;
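
The comparison against ap_vif works because the list head is embedded in the AP's own structure. The sketch below reproduces that effect in user space with a hand-written circular list; struct demo_vif, DEMO_CONTAINER_OF and the demo_* helpers are hypothetical stand-ins for owl_vif, container_of and the kernel list API, written only to illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list { struct demo_list *next, *prev; };

#define DEMO_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_vif {
    int id;
    struct demo_list bss_list; /* list head for an AP, list node for a STA */
};

static void demo_list_init(struct demo_list *h)
{
    h->next = h->prev = h;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Return the idx-th STA linked to ap, or NULL when idx is past the end. */
static struct demo_vif *demo_sta_by_idx(struct demo_vif *ap, int idx)
{
    struct demo_vif *cur;
    int i = 0;

    assert(ap != NULL);
    for (cur = DEMO_CONTAINER_OF(ap->bss_list.next, struct demo_vif, bss_list);
         &cur->bss_list != &ap->bss_list;
         cur = DEMO_CONTAINER_OF(cur->bss_list.next, struct demo_vif, bss_list)) {
        if (i < idx) {
            ++i;
            continue;
        }
        break;
    }
    /* If the loop ran off the end, cur wraps the list head, so cur == ap. */
    return cur == ap ? NULL : cur;
}
```

An empty list and an out-of-range index both end with the cursor equal to the AP, so a single pointer comparison covers both cases.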

Next, copy the STA's MAC address (sta_vif->ndev->dev_addr) into the mac parameter:

memcpy(mac, sta_vif->ndev->dev_addr, ETH_ALEN);

Finally, fill in sinfo in the same way as vwifi's existing owl_get_station, taking care to return ret at the end.

sinfo->filled = BIT_ULL(NL80211_STA_INFO_TX_PACKETS) |
                BIT_ULL(NL80211_STA_INFO_RX_PACKETS) |
                BIT_ULL(NL80211_STA_INFO_TX_FAILED) |
                BIT_ULL(NL80211_STA_INFO_TX_BYTES) |
                BIT_ULL(NL80211_STA_INFO_RX_BYTES) |
                BIT_ULL(NL80211_STA_INFO_SIGNAL) |
                BIT_ULL(NL80211_STA_INFO_INACTIVE_TIME);

if (sta_vif->sme_state == SME_CONNECTED) {
    sinfo->filled |= BIT_ULL(NL80211_STA_INFO_CONNECTED_TIME);
    sinfo->connected_time =
        jiffies_to_msecs(jiffies - sta_vif->conn_time) / 1000;
}

sinfo->tx_packets = sta_vif->stats.tx_packets;
sinfo->rx_packets = sta_vif->stats.rx_packets;
sinfo->tx_failed = sta_vif->stats.tx_dropped;
sinfo->tx_bytes = sta_vif->stats.tx_bytes;
sinfo->rx_bytes = sta_vif->stats.rx_bytes;
/* For CFG80211_SIGNAL_TYPE_MBM, value is expressed in dBm */
sinfo->signal = rand_int_smooth(-100, -30, jiffies);
sinfo->inactive_time = jiffies_to_msecs(jiffies - sta_vif->active_time);
/* TODO: Emulate rate and mcs */

return ret; 
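
The connected_time and inactive_time fields above are both derived from a difference of tick counters. Because jiffies is an unsigned counter, the subtraction still yields the elapsed ticks after the counter wraps around once. The following user-space sketch shows the arithmetic with a 32-bit counter; the function names and the hz parameter are assumptions for illustration, and the real jiffies_to_msecs is implemented differently inside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed ticks between start and now; unsigned subtraction is modulo 2^32,
 * so the result stays correct across a single wrap of the counter.
 */
static uint32_t demo_elapsed_ticks(uint32_t now, uint32_t start)
{
    return now - start;
}

/* Convert ticks to milliseconds for a tick rate of hz ticks per second. */
static uint32_t demo_ticks_to_msecs(uint32_t ticks, uint32_t hz)
{
    assert(hz > 0);
    return (uint32_t)((uint64_t)ticks * 1000u / hz);
}

/* Same shape as: jiffies_to_msecs(jiffies - conn_time) / 1000 */
static uint32_t demo_connected_secs(uint32_t now, uint32_t conn_time,
                                    uint32_t hz)
{
    return demo_ticks_to_msecs(demo_elapsed_ticks(now, conn_time), hz) / 1000u;
}
```

For example, with an assumed tick rate of 250, a start value of 0xFFFFFF00 and a current value of 244 are 500 ticks apart, which corresponds to 2 seconds.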

At run time, the pr_info output confirms that the function is called as expected:

[22039.933404] Dump station at the idx 0
[22039.933411] Dump station at the idx 1
[22039.933414] Dump station at the idx 2

Final result (owl0 is the AP; owl1 and owl2 are the connected STAs).
vwifi now correctly reports the STAs currently connected to the AP.

$ sudo ip netns exec ns1 iw dev owl1 scan
BSS 00:6f:77:6c:30:00(on owl1)
	TSF: 76069798753 usec (0d, 21:07:49)
	freq: 2412
	beacon interval: 100 TUs
	capability: ESS (0x0001)
	signal: -61.00 dBm
	last seen: 0 ms ago
	SSID: TestAP1
$ sudo ip netns exec ns2 iw dev owl2 scan
BSS 00:6f:77:6c:30:00(on owl2)
	TSF: 76085022688 usec (0d, 21:08:05)
	freq: 2412
	beacon interval: 100 TUs
	capability: ESS (0x0001)
	signal: -37.00 dBm
	last seen: 0 ms ago
	SSID: TestAP1
$ sudo ip netns exec ns2 iw dev owl2 connect TestAP1
$ sudo ip netns exec ns1 iw dev owl1 connect TestAP1
$ sudo ip netns exec ns0 iw dev owl0 station dump
Station 00:6f:77:6c:31:00 (on owl0)
	inactive time:	3040440 ms
	rx bytes:	0
	rx packets:	0
	tx bytes:	0
	tx packets:	0
	tx failed:	0
	signal:  	-34 dBm
	connected time:	445 seconds
	current time:	1686636638344 ms
Station 00:6f:77:6c:32:00 (on owl0)
	inactive time:	3040440 ms
	rx bytes:	0
	rx packets:	0
	tx bytes:	0
	tx packets:	0
	tx failed:	0
	signal:  	-34 dBm
	connected time:	439 seconds
	current time:	1686636638344 ms

commit d31d733

Submit a pull request

jserv

Implementing passive scan

a STA listens on the Beacon frames that an AP periodically sends in each channel to obtain AP information. A Beacon frame contains information including the SSID and supported rate.
To save power of a STA, enable the STA to passively scan wireless networks.

In short, with passive scan the AP transmits Beacon frames at a fixed interval, and a STA that passively receives a Beacon frame learns that the AP exists.

vwifi currently implements only active scan. To support roaming, consider the conditions that typically trigger a roam:
(1) A weak signal quality: The AP’s current RSSI value is weak (below -75 dBm)
(2) Loss of beacons: When beacons from a connected AP are not received after 2 seconds with an active station
(3) Channel utilization (CU): Multiple clients are connected to the same AP. Despite having a strong radio signal, connectivity may be disrupted due to network overload. When this is the case, the AP will notify clients of its current traffic using the CU factor in the beacons. The mobile device will start scanning if the received CU value is greater than 70% and the current RSSI value is between -65 and -75 dBm.
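
The three conditions above can be folded into a single predicate. The following is only a sketch in user-space C: the struct, the function name and the exact comparison operators (for example, treating "after 2 seconds" as 2000 ms or more) are assumptions made for illustration and are not part of vwifi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what a STA knows about its current AP. */
struct demo_link_state {
    int rssi_dbm;                      /* RSSI of the current AP */
    unsigned int ms_since_last_beacon; /* time since the last beacon arrived */
    unsigned int channel_util_pct;     /* CU advertised in beacons, 0..100 */
};

/* Return true when the STA should start scanning for a better AP. */
static bool demo_should_scan_for_roam(const struct demo_link_state *s)
{
    assert(s != NULL && s->channel_util_pct <= 100);

    /* (1) weak signal quality */
    if (s->rssi_dbm < -75)
        return true;

    /* (2) loss of beacons */
    if (s->ms_since_last_beacon >= 2000)
        return true;

    /* (3) overloaded AP combined with a mediocre signal */
    if (s->channel_util_pct > 70 && s->rssi_dbm >= -75 && s->rssi_dbm <= -65)
        return true;

    return false;
}
```

Conditions (2) and (3) both take their inputs from received beacons, so neither can be evaluated without a beacon path.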

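These conditions show that a STA must be able to receive beacons,
so I implemented beaconing first.

To simulate beacons, the key point is that an AP has to send them to STAs periodically.
I therefore implemented this with delayed_work.
First, add dw_send_beacon to the AP-mode part of owl_vif: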

/* Structure for AP mode */
struct {
    bool ap_enabled;
    /* List node for storing AP (owl->ap_list is the head),
     * this field is for interface in AP mode.
     */
    struct list_head ap_list;

+   struct delayed_work dw_send_beacon;
};

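Next is the trigger condition. A virtual interface created by vwifi starts in STA mode by default, so it does not need to send beacons. Beaconing has to start when the interface switches from STA mode to AP mode, that is, when owl_start_ap runs. I therefore added the following to owl_start_ap.
Cancel the work items that may have been queued while in STA mode: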

/* Make sure that no work is queued */
cancel_work_sync(&vif->ws_connect);
cancel_work_sync(&vif->ws_disconnect);
cancel_work_sync(&vif->ws_scan);
cancel_work_sync(&vif->ws_scan_timeout);

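Initialize the delayed work and queue it. The delay represents the beacon interval and was originally set to 10 seconds.
Since real-world beacon intervals are usually around 100 ms, I later changed it to 0.1 * HZ (the worker listing and the dmesg logs shown later in this section still reflect the original 10-second value).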

INIT_DELAYED_WORK(&vif->dw_send_beacon, owl_send_beacon_work);
schedule_delayed_work(&vif->dw_send_beacon, 0.1 * HZ);

Next, look at what owl_send_beacon_work, the function assigned to dw_send_beacon, does.
In short, it calls beaconing and then schedules the next beacon transmission.

/* Notify the "dummy" BSS to the kernel and schedule the next beacon.
 */
static void owl_send_beacon_work(struct work_struct *w)
{
    struct owl_vif *ap_vif =
        container_of(w, struct owl_vif, dw_send_beacon.work);

    /* inform with dummy BSS */
    beaconing(ap_vif);

    schedule_delayed_work(&ap_vif->dw_send_beacon, 10 * HZ);
}
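
The self-rearming pattern used by owl_send_beacon_work can be tried outside the kernel with a virtual clock. The sketch below is user-space C, not kernel code: demo_dwork, demo_schedule, demo_cancel and demo_run_until are hypothetical stand-ins for struct delayed_work, schedule_delayed_work, cancel_delayed_work_sync and the kernel workqueue, and they only model the timing.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_dwork {
    void (*fn)(struct demo_dwork *);
    unsigned long expires; /* virtual tick at which fn should run */
    int pending;
};

struct demo_ap {
    struct demo_dwork dw_send_beacon;
    unsigned long now;      /* virtual clock, in ticks */
    unsigned long interval; /* beacon interval, in ticks */
    int beacons_sent;
};

static void demo_schedule(struct demo_dwork *w, unsigned long now,
                          unsigned long delay)
{
    assert(w != NULL && w->fn != NULL);
    w->expires = now + delay;
    w->pending = 1;
}

static void demo_cancel(struct demo_dwork *w)
{
    w->pending = 0;
}

/* Work function: send one beacon, then re-arm itself for the next one. */
static void demo_send_beacon_work(struct demo_dwork *w)
{
    struct demo_ap *ap = DEMO_CONTAINER_OF(w, struct demo_ap, dw_send_beacon);

    ap->beacons_sent++; /* stands in for beaconing(ap) */
    demo_schedule(w, ap->now, ap->interval);
}

/* Advance the virtual clock tick by tick and run the work when it expires. */
static void demo_run_until(struct demo_ap *ap, unsigned long end)
{
    while (ap->now < end) {
        struct demo_dwork *w = &ap->dw_send_beacon;

        ap->now++;
        if (w->pending && ap->now >= w->expires) {
            w->pending = 0;
            w->fn(w);
        }
    }
}
```

With an interval of 10 ticks, running the clock to tick 35 fires the work at ticks 10, 20 and 30. After demo_cancel nothing fires any more, which mirrors why the stop path has to cancel the work explicitly: the work function would otherwise keep re-arming itself.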

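Now look at the beaconing function.
Most of it is modeled on inform_bss, since the two do essentially the same thing; the only difference is active versus passive.
beaconing takes the AP that is sending the beacon as input, then uses the existing doubly linked list owl->vif_list to visit every interface in STA mode.
A beacon is delivered to each STA visited. One important piece is still missing, though: in the future the signal strength should be derived from a simulated space, whereas for now the beacon is simply delivered to every STA.
Once signal strength can be computed from the simulated space, vwifi will be able to simulate beacon loss and, ultimately, implement roaming.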

/* Helper functions to prepare structures with custom BSS information
 * and "inform" the kernel about the "new" BSS.
 * The specified AP will visit the vif_list and send a beacon to the visited
 * STA.
 */
static void beaconing(struct owl_vif *ap)
{
    struct owl_vif *vif;
    list_for_each_entry (vif, &owl->vif_list, list) {
        if (vif->wdev.iftype == NL80211_IFTYPE_AP)
            continue;

        struct cfg80211_bss *bss = NULL;

        /* TODO: Simulate space to represent signal strength */

        struct cfg80211_inform_bss data = {
            /* the only channel */
            .chan = &ap->wdev.wiphy->bands[NL80211_BAND_2GHZ]->channels[0],
            .scan_width = NL80211_BSS_CHAN_WIDTH_20,
            .signal = DBM_TO_MBM(rand_int_smooth(-100, -30, jiffies)),
        };

        pr_info("owl: %s receive beacon from %s (SSID: %s, BSSID: %pM)\n",
                vif->ndev->name, ap->ndev->name, ap->ssid, ap->bssid);

        u8 *ie = kmalloc(ap->ssid_len + 2, GFP_KERNEL);
        if (!ie) /* skip this STA if the allocation fails */
            continue;
        ie[0] = WLAN_EID_SSID;
        ie[1] = ap->ssid_len;
        memcpy(ie + 2, ap->ssid, ap->ssid_len);

        /* Using the CLOCK_BOOTTIME clock, which remains unaffected by changes
         * in the system time-of-day clock and includes any time that the
         * system is suspended.
         * This clock is suitable for synchronizing the machines in the BSS
         * using tsf.
         */
        u64 tsf = div_u64(ktime_get_boottime_ns(), 1000);

        /* It is possible to use cfg80211_inform_bss() instead. */
        bss = cfg80211_inform_bss_data(
            vif->wdev.wiphy, &data, CFG80211_BSS_FTYPE_BEACON, ap->bssid, tsf,
            WLAN_CAPABILITY_ESS, 10000, ie, ap->ssid_len + 2, GFP_KERNEL);

        /* cfg80211_inform_bss_data() returns a cfg80211_bss structure whose
         * reference counter should be decremented if it is unused.
         */
        cfg80211_put_bss(vif->wdev.wiphy, bss);
        kfree(ie);
    }
}
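
The ie buffer built in beaconing is a single information element in type-length-value form: one byte of element ID, one byte of length, then the SSID bytes. A minimal user-space sketch of that layout follows; demo_build_ssid_ie and the DEMO_* constants are hypothetical names, with malloc standing in for kmalloc, and the 32-byte limit is the maximum SSID length in IEEE 802.11.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_EID_SSID 0      /* element ID of the SSID element */
#define DEMO_MAX_SSID_LEN 32 /* longest SSID allowed by IEEE 802.11 */

/* Build an SSID information element: [ID][length][SSID bytes].
 * Returns a malloc'd buffer that the caller must free, and stores its size
 * in *ie_len. Returns NULL if the allocation fails.
 */
static uint8_t *demo_build_ssid_ie(const uint8_t *ssid, size_t ssid_len,
                                   size_t *ie_len)
{
    uint8_t *ie;

    assert(ssid != NULL && ie_len != NULL);
    assert(ssid_len <= DEMO_MAX_SSID_LEN);

    ie = malloc(ssid_len + 2);
    if (!ie)
        return NULL;

    ie[0] = DEMO_EID_SSID;
    ie[1] = (uint8_t)ssid_len;
    memcpy(ie + 2, ssid, ssid_len);
    *ie_len = ssid_len + 2;
    return ie;
}
```

For the SSID "TestAP" this produces an 8-byte element: the ID byte 0, the length byte 6, and the six SSID characters without a terminating NUL.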

Finally, owl_stop_ap, which also runs when vwifi is removed, is modified to cancel the delayed work that has already been queued.

static int owl_stop_ap(struct wiphy *wiphy, struct net_device *ndev)
{
    struct owl_vif *vif = ndev_get_owl_vif(ndev);
    struct owl_vif *pos = NULL, *safe = NULL;

    pr_info("owl: %s stop acting in AP mode.\n", ndev->name);

    if (owl->state == OWL_READY) {
        /* Destroy bss_list first */
        list_for_each_entry_safe (pos, safe, &vif->bss_list, bss_list)
            list_del(&pos->bss_list);

        /* Remove ap from global ap_list */
        if (mutex_lock_interruptible(&owl->lock))
            return -ERESTARTSYS;

        list_del(&vif->ap_list);

        mutex_unlock(&owl->lock);
    }

    vif->ap_enabled = false;
+   cancel_delayed_work_sync(&vif->dw_send_beacon);

    return 0;
}

Finally, the messages printed by dmesg show that vwifi does send a beacon every 10 seconds.
The scenario below has one AP (owl0) and two STAs (owl1 and owl2).

$ dmesg | tail
[ 8066.322885] owl: owl0 start acting in AP mode.
[ 8066.322894] ctrlchn=6, center=2437, bw=0, beacon_interval=100, dtim_period=2,
[ 8066.322900] ssid=TestAP(6), auth_type=8, inactivity_timeout=0
[ 8066.322973] IPv6: ADDRCONF(NETDEV_CHANGE): owl0: link becomes ready
[ 8076.496309] owl: owl1 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8076.496336] owl: owl2 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8086.736196] owl: owl1 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8086.736221] owl: owl2 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8096.976140] owl: owl1 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8096.976165] owl: owl2 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8107.216005] owl: owl1 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)
[ 8107.216032] owl: owl2 receive beacon from owl0 (SSID: TestAP, BSSID: 00:6f:77:6c:30:00)

The next scenario has two APs (owl0 and owl1) and one STA (owl2).

$ dmesg | tail
[ 9178.231699] ssid=TestAP1(7), auth_type=8, inactivity_timeout=0
[ 9178.231773] IPv6: ADDRCONF(NETDEV_CHANGE): owl1: link becomes ready
[ 9188.280851] owl: owl2 receive beacon from owl1 (SSID: TestAP1, BSSID: 00:6f:77:6c:31:00)
[ 9188.280875] owl: owl2 receive beacon from owl0 (SSID: TestAP0, BSSID: 00:6f:77:6c:30:00)
[ 9198.520624] owl: owl2 receive beacon from owl0 (SSID: TestAP0, BSSID: 00:6f:77:6c:30:00)
[ 9198.520651] owl: owl2 receive beacon from owl1 (SSID: TestAP1, BSSID: 00:6f:77:6c:31:00)
[ 9208.760680] owl: owl2 receive beacon from owl1 (SSID: TestAP1, BSSID: 00:6f:77:6c:31:00)
[ 9208.760702] owl: owl2 receive beacon from owl0 (SSID: TestAP0, BSSID: 00:6f:77:6c:30:00)
[ 9219.000304] owl: owl2 receive beacon from owl0 (SSID: TestAP0, BSSID: 00:6f:77:6c:30:00)
[ 9219.000330] owl: owl2 receive beacon from owl1 (SSID: TestAP1, BSSID: 00:6f:77:6c:31:00)

TODO: improve the description of 2GHz channels/rates

See mac80211_hwsim

TODO: emulate rate and MCS

owl_get_station should emulate rate and MCS (Modulation and Coding Scheme); see mac80211_hwsim

TODO: submit a description of vwifi to LKMPG

Write an English description of how vwifi works, which can later be integrated into The Linux Kernel Module Programming Guide