
How Android 10 Adds Routes

This article uses the WiFi connection process as an example to analyze how routes are added to the routing tables.

When WiFi works in client mode, the ClientModeImpl state machine is used. It enters ConnectModeState, which in turn calls setupClientMode:

[ClientModeImpl.java]

private void setupClientMode() {
    ...
    updateDataInterface();
    ...
}
           

The implementation of updateDataInterface:

private void updateDataInterface() {
    ...
    mIpClientCallbacks = new IpClientCallbacksImpl();    // Create the callback object, which the service side invokes
    mFacade.makeIpClient(mContext, mDataInterfaceName, mIpClientCallbacks);    // Create the IpClient through the netstack service
    if (!mIpClientCallbacks.awaitCreation()) {
        loge("Timeout waiting for IpClient");
    }
    ...
}
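
The awaitCreation call above blocks until the service side has handed back the IpClient. A minimal sketch of that kind of wait, assuming a CountDownLatch; the class CreationCallback, its String placeholder, and the timeout parameter are hypothetical and not taken from the Android source:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a callback that another thread completes.
final class CreationCallback {
    private final CountDownLatch created = new CountDownLatch(1);
    private volatile String client;  // placeholder for the created object

    // Called by the "service side" once the object exists.
    void onCreated(String newClient) {
        client = newClient;
        created.countDown();
    }

    // Returns true if onCreated ran within the timeout, false otherwise.
    boolean awaitCreation(long timeoutMs) {
        try {
            return created.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    String client() {
        return client;
    }

    public static void main(String[] args) {
        CreationCallback cb = new CreationCallback();
        new Thread(() -> cb.onCreated("ipclient-wlan0")).start();
        System.out.println(cb.awaitCreation(1000) ? cb.client() : "timeout");
    }
}
```

The caller only logs an error when the wait times out, which matches the loge branch in updateDataInterface.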
           

makeIpClient calls the makeIpClient function of the netstack service:

@Override
public void makeIpClient(String ifName, IIpClientCallbacks cb) throws RemoteException {
    checkNetworkStackCallingPermission();
    updateSystemAidlVersion(cb.getInterfaceVersion());
    final IpClient ipClient = new IpClient(mContext, ifName, cb, mObserverRegistry, this);

    synchronized (mIpClients) {
        final Iterator<WeakReference<IpClient>> it = mIpClients.iterator();
        while (it.hasNext()) {
            final IpClient ipc = it.next().get();
            if (ipc == null) {
                it.remove();
            }
        }
        mIpClients.add(new WeakReference<>(ipClient));
    }

    cb.onIpClientCreated(ipClient.makeConnector());
}
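
The loop in makeIpClient drops references whose IpClient has already been garbage collected before adding the new one. A minimal sketch of that pattern, assuming a generic container; the class WeakRegistry is hypothetical:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical registry that tracks objects without keeping them alive.
final class WeakRegistry<T> {
    private final List<WeakReference<T>> refs = new ArrayList<>();

    // Drops cleared references, then records the new object.
    synchronized void add(T item) {
        final Iterator<WeakReference<T>> it = refs.iterator();
        while (it.hasNext()) {
            if (it.next().get() == null) {
                it.remove();
            }
        }
        refs.add(new WeakReference<>(item));
    }

    // Number of stored references, including any cleared since the last add.
    synchronized int size() {
        return refs.size();
    }
}
```

Because pruning only happens on add, the list never grows beyond the number of live objects plus those collected since the previous call.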
           

It is worth noting where mObserverRegistry is initialized: in the NetworkStackConnector constructor.

NetworkStackConnector(Context context) {
    mContext = context;
    mNetd = INetd.Stub.asInterface(
            (IBinder) context.getSystemService(Context.NETD_SERVICE));    // Obtain the netd service
    mObserverRegistry = new NetworkObserverRegistry();
    mCm = context.getSystemService(ConnectivityManager.class);
    mIpMemoryStoreService = new IpMemoryStoreService(context);

    int netdVersion;
    try {
        netdVersion = mNetd.getInterfaceVersion();
    } catch (RemoteException e) {
        mLog.e("Error obtaining INetd version", e);
        netdVersion = -1;
    }
    mNetdAidlVersion = netdVersion;

    try {
        mObserverRegistry.register(mNetd);
    } catch (RemoteException e) {
        mLog.e("Error registering observer on Netd", e);
    }
}
           

NetworkStackConnector is the implementation of the netstack service. The constructor above first obtains the netd service, then initializes mObserverRegistry, and immediately calls its register function:

public class NetworkObserverRegistry extends INetdUnsolicitedEventListener.Stub {
    ...
    void register(@NonNull INetd netd) throws RemoteException {
        netd.registerUnsolicitedEventListener(this);
    }
    ...
}
           

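register registers an event callback with the netd service. mObserverRegistry is itself the callback object, as the class hierarchy shows: NetworkObserverRegistry extends INetdUnsolicitedEventListener.Stub. Next, look at the function in netd that registers the event callback: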

binder::Status NetdNativeService::registerUnsolicitedEventListener(
        const android::sp<android::net::INetdUnsolicitedEventListener>& listener) {
    ENFORCE_NETWORK_STACK_PERMISSIONS();
    gCtls->eventReporter.registerUnsolEventListener(listener);
    return binder::Status::ok();
}
           

The registerUnsolEventListener member function is defined in the EventReporter class:

void EventReporter::registerUnsolEventListener(
        const android::sp<INetdUnsolicitedEventListener>& listener) {
    ...
    mUnsolListenerMap.insert({listener, deathRecipient});
}
           

Registration only adds the listener to the mUnsolListenerMap container. The registered listeners can be retrieved through getNetdUnsolicitedEventListenerMap:

EventReporter::UnsolListenerMap EventReporter::getNetdUnsolicitedEventListenerMap() const {
    std::lock_guard lock(mUnsolicitedMutex);
    return mUnsolListenerMap;
}
           

Where are the listeners actually used? To answer that, first look at NetlinkManager, which talks to the kernel over netlink and handles the events the kernel reports through NetlinkHandler.

NetlinkHandler *NetlinkManager::setupSocket(int *sock, int netlinkFamily,
    int groups, int format, bool configNflog) {
    ...
    nladdr.nl_family = AF_NETLINK;
    // Kernel will assign a unique nl_pid if set to zero.
    nladdr.nl_pid = 0;    // 0: let the kernel assign the ID
    nladdr.nl_groups = groups;

    if ((*sock = socket(PF_NETLINK, SOCK_DGRAM | SOCK_CLOEXEC, netlinkFamily)) < 0) {    // Create the netlink socket
        ALOGE("Unable to create netlink socket for family %d: %s", netlinkFamily, strerror(errno));
        return nullptr;
    }
    ...
    NetlinkHandler *handler = new NetlinkHandler(this, *sock, format);    // Create the NetlinkHandler that receives messages
    if (handler->start()) {
        ALOGE("Unable to start NetlinkHandler: %s", strerror(errno));
        close(*sock);
        return nullptr;
    }
    ...
}
           

The part we care about is how NetlinkHandler processes messages:

class NetlinkHandler : public ::NetlinkListener {
  ...
  protected:
    virtual void onEvent(NetlinkEvent *evt);
};
           

NetlinkHandler inherits from NetlinkListener. When the socket that NetlinkListener monitors receives a message from the kernel, the subclass's onEvent is called to handle it:

void NetlinkHandler::onEvent(NetlinkEvent *evt) {
    const char *subsys = evt->getSubsystem();
    if (!subsys) {
        ALOGW("No subsystem found in netlink event");
        return;
    }

    if (!strcmp(subsys, "net")) {    // The event belongs to the net subsystem
        NetlinkEvent::Action action = evt->getAction();
        const char *iface = evt->findParam("INTERFACE");
        ...
        if (action == NetlinkEvent::Action::kRouteUpdated ||
                   action == NetlinkEvent::Action::kRouteRemoved) {
            const char *route = evt->findParam("ROUTE");
            const char *gateway = evt->findParam("GATEWAY");
            const char *iface = evt->findParam("INTERFACE");
            if (route && (gateway || iface)) {
                notifyRouteChange((action == NetlinkEvent::Action::kRouteUpdated) ? true : false,
                                  route, (gateway == nullptr) ? "" : gateway,
                                  (iface == nullptr) ? "" : iface);
            }
        }
        ...
    }
}
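
The condition in onEvent forwards a route event only when it carries a ROUTE parameter and at least one of GATEWAY or INTERFACE, and it replaces a missing value with an empty string. A minimal sketch of that check, assuming the parameters arrive as a map; the class RouteEventFilter and its describe method are hypothetical:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper mirroring the parameter check for route events.
final class RouteEventFilter {
    // Returns a description of the route change, or empty if the event has
    // no ROUTE or has neither GATEWAY nor INTERFACE.
    static Optional<String> describe(boolean updated, Map<String, String> params) {
        final String route = params.get("ROUTE");
        final String gateway = params.get("GATEWAY");
        final String iface = params.get("INTERFACE");
        if (route == null || (gateway == null && iface == null)) {
            return Optional.empty();
        }
        // A missing gateway or interface is passed on as an empty string.
        return Optional.of((updated ? "updated " : "removed ") + route
                + " gateway=" + (gateway == null ? "" : gateway)
                + " iface=" + (iface == null ? "" : iface));
    }
}
```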
           

When a route changes, the kernel reports a kRouteUpdated (or kRouteRemoved) event, and notifyRouteChange is called to handle it:

void NetlinkHandler::notifyRouteChange(bool updated, const std::string& route,
                                       const std::string& gateway, const std::string& ifName) {
    LOG_EVENT_FUNC(BINDER_RETRY, onRouteChanged, updated, route, gateway, ifName);
}
           

The LOG_EVENT_FUNC macro is defined as follows:

#define LOG_EVENT_FUNC(retry, func, ...)                                                    \
    do {                                                                                    \
        const auto listenerMap = gCtls->eventReporter.getNetdUnsolicitedEventListenerMap(); \
        for (auto& listener : listenerMap) {                                                \
            auto entry = gUnsolicitedLog.newEntry().function(#func).args(__VA_ARGS__);      \
            if (retry(listener.first->func(__VA_ARGS__))) {                                 \
                gUnsolicitedLog.log(entry.withAutomaticDuration());                         \
            }                                                                               \
        }                                                                                   \
    } while (0)
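
The macro takes a copy of the listener map (getNetdUnsolicitedEventListenerMap returns it by value under a lock) and then calls every listener. A rough Java analogue of that snapshot-then-dispatch pattern, offered as a sketch only; ListenerRegistry and RouteListener are hypothetical names, and the logging and retry parts of the macro are left out:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java analogue of netd's listener bookkeeping.
final class ListenerRegistry {
    interface RouteListener {
        void onRouteChanged(boolean updated, String route, String gateway, String ifName);
    }

    private final Object lock = new Object();
    private final List<RouteListener> listeners = new ArrayList<>();

    void register(RouteListener listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    // Copies the list under the lock so callbacks run without holding it.
    private List<RouteListener> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(listeners);
        }
    }

    // Delivers one event to every listener registered at the time of the call.
    void notifyRouteChange(boolean updated, String route, String gateway, String ifName) {
        for (RouteListener listener : snapshot()) {
            listener.onRouteChanged(updated, route, gateway, ifName);
        }
    }
}
```

Working on a copy means a listener that registers or unregisters during dispatch cannot disturb the iteration.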
           

The macro iterates over the previously registered listeners and calls func on each one. After macro expansion, func is onRouteChanged, so mObserverRegistry's onRouteChanged is called back:

public void onRouteChanged(boolean updated, String route, String gateway, String ifName) {
    final RouteInfo processRoute = new RouteInfo(new IpPrefix(route),
            ("".equals(gateway)) ? null : InetAddresses.parseNumericAddress(gateway),
            ifName, RTN_UNICAST);
    if (updated) {
        invokeForAllObservers(o -> o.onRouteUpdated(processRoute));
    } else {
        invokeForAllObservers(o -> o.onRouteRemoved(processRoute));
    }
}
           

invokeForAllObservers iterates over the previously registered observers and calls each observer's onRouteUpdated (or onRouteRemoved) function.

Where are the observers registered? That goes back to the IpClient object created in makeIpClient. The IpClient constructor sets up mLinkObserver:

IpClient(Context context, String ifName, IIpClientCallbacks callback,
        NetworkObserverRegistry observerRegistry, NetworkStackServiceManager nssManager,
        Dependencies deps) {
    ...
    mLinkObserver = new IpClientLinkObserver(
            mInterfaceName,
            () -> sendMessage(EVENT_NETLINK_LINKPROPERTIES_CHANGED)) {
        @Override
        public void onInterfaceAdded(String iface) {
            super.onInterfaceAdded(iface);
            ...
        }


        @Override
        public void onInterfaceRemoved(String iface) {
            super.onInterfaceRemoved(iface);
            ...
        }
    };
    ...
    startStateMachineUpdaters();
}
           

startStateMachineUpdaters registers the mLinkObserver defined above:

private void startStateMachineUpdaters() {
    mObserverRegistry.registerObserverForNonblockingCallback(mLinkObserver);
}
           

Continuing with the kRouteUpdated message: as analyzed above, mLinkObserver's onRouteUpdated function is called:

public void onRouteUpdated(RouteInfo route) {
    if (mInterfaceName.equals(route.getInterface())) {
        maybeLog("routeUpdated", route);
        boolean changed;
        synchronized (this) {
            changed = mLinkProperties.addRoute(route);
        }
        if (changed) {
            mCallback.update();
        }
    }
}
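
onRouteUpdated ignores routes on other interfaces and fires the callback only when the stored route set actually changed. A minimal sketch of that behavior, assuming routes are reduced to destination strings; the class InterfaceRouteTracker is hypothetical and stands in for the observer plus its LinkProperties:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical observer that tracks the routes of a single interface.
final class InterfaceRouteTracker {
    private final String ifName;
    private final Runnable onChange;
    private final Set<String> routes = new LinkedHashSet<>();

    InterfaceRouteTracker(String ifName, Runnable onChange) {
        this.ifName = ifName;
        this.onChange = onChange;
    }

    // Records the route if it belongs to this interface; notifies only on change.
    void onRouteUpdated(String routeIfName, String destination) {
        if (!ifName.equals(routeIfName)) {
            return;
        }
        final boolean changed;
        synchronized (this) {
            changed = routes.add(destination);  // false if already present
        }
        if (changed) {
            onChange.run();  // invoked outside the lock
        }
    }
}
```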
           

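Here mCallback is the lambda expression passed to the IpClientLinkObserver constructor, so the statement that runs is sendMessage(EVENT_NETLINK_LINKPROPERTIES_CHANGED).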

IpClient handles the message:

case EVENT_NETLINK_LINKPROPERTIES_CHANGED:
    handleLinkPropertiesUpdate(NO_CALLBACKS);
    break;
           

handleLinkPropertiesUpdate dispatches the change through dispatchCallback, whose default branch calls onLinkPropertiesChange on mCallback.

mCallback is a member variable of IpClient. Its value wraps the mIpClientCallbacks object passed in from ClientModeImpl, so IpClientCallbacksImpl's onLinkPropertiesChange is called:

public void onLinkPropertiesChange(LinkProperties newLp) {
    sendMessage(CMD_UPDATE_LINKPROPERTIES, newLp);
}
           

This message is then passed to ConnectivityService through mNetworkAgent.sendLinkProperties(mLinkProperties):

case NetworkAgent.EVENT_NETWORK_PROPERTIES_CHANGED: {
    handleUpdateLinkProperties(nai, (LinkProperties) msg.obj);
    break;
}
           

handleUpdateLinkProperties calls updateLinkProperties, which finally processes the message:

private void updateLinkProperties(NetworkAgentInfo networkAgent, LinkProperties newLp,
            LinkProperties oldLp) {
    ...
    updateRoutes(newLp, oldLp, netId);    // Update the routing table
    ...
}
           

Summary:

  • Registering for netlink events:

    ClientModeImpl -> netstack -> netd -> kernel

  • Propagation of a route-added event:

    kernel -> netd -> NetworkObserverRegistry -> IpClientCallbacksImpl

The remaining work of updateRoutes is to create the routing table and add the routes to it (only the add case is considered here).

private boolean updateRoutes(LinkProperties newLp, LinkProperties oldLp, int netId) {
    // Compare the route diff to determine which routes should be added and removed.
    CompareResult<RouteInfo> routeDiff = new CompareResult<RouteInfo>(
            oldLp != null ? oldLp.getAllRoutes() : null,
            newLp != null ? newLp.getAllRoutes() : null);
    for (RouteInfo route : routeDiff.added) {
        if (route.hasGateway()) continue;
        if (VDBG || DDBG) log("Adding Route [" + route + "] to network " + netId);
        try {
            mNMS.addRoute(netId, route);    // Add a route without a gateway
        } catch (Exception e) {
            if ((route.getDestination().getAddress() instanceof Inet4Address) || VDBG) {
                loge("Exception in addRoute for non-gateway: " + e);
            }
        }
    }
    for (RouteInfo route : routeDiff.added) {
        if (route.hasGateway() == false) continue;
        if (VDBG || DDBG) log("Adding Route [" + route + "] to network " + netId);
        try {
            mNMS.addRoute(netId, route);    // Add a route with a gateway
        } catch (Exception e) {
            if ((route.getGateway() instanceof Inet4Address) || VDBG) {
                loge("Exception in addRoute for gateway: " + e);
            }
        }
    }

    for (RouteInfo route : routeDiff.removed) {
        if (VDBG || DDBG) log("Removing Route [" + route + "] from network " + netId);
        try {
            mNMS.removeRoute(netId, route);
        } catch (Exception e) {
            loge("Exception in removeRoute: " + e);
        }
    }
    return !routeDiff.added.isEmpty() || !routeDiff.removed.isEmpty();
}
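
updateRoutes computes the difference between the old and new route lists and adds the routes in two passes: routes without a gateway first, then routes with one. A plausible reason, stated here as an assumption, is that the kernel can only accept a gateway route once the gateway itself is reachable through an existing route. A minimal sketch of the diff and ordering; RouteDiff and its simplified Route type are hypothetical and are not the framework's CompareResult or RouteInfo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper: diff two route lists and order the additions.
final class RouteDiff {
    static final class Route {
        final String destination;
        final String gateway;  // null for a directly connected route

        Route(String destination, String gateway) {
            this.destination = destination;
            this.gateway = gateway;
        }

        boolean hasGateway() {
            return gateway != null;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Route)) {
                return false;
            }
            final Route other = (Route) o;
            return destination.equals(other.destination)
                    && Objects.equals(gateway, other.gateway);
        }

        @Override
        public int hashCode() {
            return Objects.hash(destination, gateway);
        }
    }

    // Routes in newRoutes but not in oldRoutes, gateway-less routes first.
    static List<Route> additionsInOrder(List<Route> oldRoutes, List<Route> newRoutes) {
        final List<Route> added = new ArrayList<>(newRoutes);
        added.removeAll(oldRoutes);
        final List<Route> ordered = new ArrayList<>();
        for (Route r : added) {
            if (!r.hasGateway()) ordered.add(r);
        }
        for (Route r : added) {
            if (r.hasGateway()) ordered.add(r);
        }
        return ordered;
    }
}
```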
           

mNMS is the proxy object for INetworkManagementService. The service side implements the addRoute function:

[NetworkManagementService.java]

public void addRoute(int netId, RouteInfo route) {
    modifyRoute(MODIFY_OPERATION_ADD, netId, route);
}
           

The implementation of modifyRoute:

private void modifyRoute(boolean add, int netId, RouteInfo route) {
    ...
        if (add) {
            mNetdService.networkAddRoute(netId, ifName, dst, nextHop);
        } else {
            mNetdService.networkRemoveRoute(netId, ifName, dst, nextHop);
        }
    ...
}
           

mNetdService is the proxy for the netd service, so the service-side networkAddRoute function is called:

[NetdNativeService.cpp]

binder::Status NetdNativeService::networkAddRoute(int32_t netId, const std::string& ifName,
                                                  const std::string& destination,
                                                  const std::string& nextHop) {
    ...
    int res = gCtls->netCtrl.addRoute(netId, ifName.c_str(), destination.c_str(),
                                      nextHop.empty() ? nullptr : nextHop.c_str(), legacy, uid);
    return statusFromErrcode(res);
}
           

The addRoute implementation in NetworkController:

[NetworkController.cpp]

int NetworkController::addRoute(unsigned netId, const char* interface, const char* destination,
                                const char* nexthop, bool legacy, uid_t uid) {
    return modifyRoute(netId, interface, destination, nexthop, true, legacy, uid);
}
           

The modifyRoute source in NetworkController:

int NetworkController::modifyRoute(unsigned netId, const char* interface, const char* destination,
                                   const char* nexthop, bool add, bool legacy, uid_t uid) {
    ...
    return add ? RouteController::addRoute(interface, destination, nexthop, tableType) :
                 RouteController::removeRoute(interface, destination, nexthop, tableType);
}
           

The addRoute implementation in RouteController:

[RouteController.cpp]

int RouteController::addRoute(const char* interface, const char* destination, const char* nexthop,
                              TableType tableType) {
    return modifyRoute(RTM_NEWROUTE, interface, destination, nexthop, tableType);
}
           

The modifyRoute implementation in RouteController:

WARN_UNUSED_RESULT int RouteController::modifyRoute(uint16_t action, const char* interface,
                                                    const char* destination, const char* nexthop,
                                                    TableType tableType) {
    uint32_t table;
    switch (tableType) {    // Type of routing table
        case RouteController::INTERFACE: {    // Each network interface has its own routing table
            table = getRouteTableForInterface(interface);    // Look up the routing table by interface name
            if (table == RT_TABLE_UNSPEC) {
                return -ESRCH;
            }
            break;
        }
        case RouteController::LOCAL_NETWORK: {
            table = ROUTE_TABLE_LOCAL_NETWORK;
            break;
        }
        case RouteController::LEGACY_NETWORK: {
            table = ROUTE_TABLE_LEGACY_NETWORK;
            break;
        }
        case RouteController::LEGACY_SYSTEM: {
            table = ROUTE_TABLE_LEGACY_SYSTEM;
            break;
        }
    }

    int ret = modifyIpRoute(action, table, interface, destination, nexthop);    // Add the route entry
    // Trying to add a route that already exists shouldn't cause an error.
    if (ret && !(action == RTM_NEWROUTE && ret == -EEXIST)) {
        return ret;
    }

    return 0;
}
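
The last lines of modifyRoute treat "route already exists" as success when adding, while every other error is passed through. A minimal sketch of that rule, assuming the Linux errno value 17 for EEXIST; the class RouteResult and its Action enum are hypothetical:

```java
// Hypothetical helper mirroring how the add path tolerates duplicates.
final class RouteResult {
    static final int EEXIST = 17;  // Linux errno value for "File exists"

    enum Action { NEW_ROUTE, DEL_ROUTE }

    // Returns 0 on success or when an added route already existed;
    // otherwise returns the negative errno unchanged.
    static int normalize(Action action, int ret) {
        if (ret != 0 && !(action == Action.NEW_ROUTE && ret == -EEXIST)) {
            return ret;
        }
        return 0;
    }
}
```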
           

modifyIpRoute is where the route is actually added: it sends the route to the kernel over netlink. The code is listed below without further analysis.

WARN_UNUSED_RESULT int modifyIpRoute(uint16_t action, uint32_t table, const char* interface,
                                     const char* destination, const char* nexthop) {
    // At least the destination must be non-null.
    if (!destination) {
        ALOGE("null destination");
        return -EFAULT;
    }

    // Parse the prefix.
    uint8_t rawAddress[sizeof(in6_addr)];
    uint8_t family;
    uint8_t prefixLength;
    int rawLength = parsePrefix(destination, &family, rawAddress, sizeof(rawAddress),
                                &prefixLength);
    if (rawLength < 0) {
        ALOGE("parsePrefix failed for destination %s (%s)", destination, strerror(-rawLength));
        return rawLength;
    }

    if (static_cast<size_t>(rawLength) > sizeof(rawAddress)) {
        ALOGE("impossible! address too long (%d vs %zu)", rawLength, sizeof(rawAddress));
        return -ENOBUFS;  // Cannot happen; parsePrefix only supports IPv4 and IPv6.
    }

    uint8_t type = RTN_UNICAST;
    uint32_t ifindex;
    uint8_t rawNexthop[sizeof(in6_addr)];

    if (nexthop && !strcmp(nexthop, "unreachable")) {
        type = RTN_UNREACHABLE;
        // 'interface' is likely non-NULL, as the caller (modifyRoute()) likely used it to lookup
        // the table number. But it's an error to specify an interface ("dev ...") or a nexthop for
        // unreachable routes, so nuke them. (IPv6 allows them to be specified; IPv4 doesn't.)
        interface = OIF_NONE;
        nexthop = nullptr;
    } else if (nexthop && !strcmp(nexthop, "throw")) {
        type = RTN_THROW;
        interface = OIF_NONE;
        nexthop = nullptr;
    } else {
        // If an interface was specified, find the ifindex.
        if (interface != OIF_NONE) {
            ifindex = if_nametoindex(interface);
            if (!ifindex) {
                ALOGE("cannot find interface %s", interface);
                return -ENODEV;
            }
        }

        // If a nexthop was specified, parse it as the same family as the prefix.
        if (nexthop && inet_pton(family, nexthop, rawNexthop) <= 0) {
            ALOGE("inet_pton failed for nexthop %s", nexthop);
            return -EINVAL;
        }
    }

    bool isDefaultThrowRoute = (type == RTN_THROW && prefixLength == 0);

    // Assemble a rtmsg and put it in an array of iovec structures.
    rtmsg route = {
        .rtm_protocol = RTPROT_STATIC,
        .rtm_type = type,
        .rtm_family = family,
        .rtm_dst_len = prefixLength,
        .rtm_scope = static_cast<uint8_t>(nexthop ? RT_SCOPE_UNIVERSE : RT_SCOPE_LINK),
    };

    rtattr rtaDst     = { U16_RTA_LENGTH(rawLength), RTA_DST };
    rtattr rtaGateway = { U16_RTA_LENGTH(rawLength), RTA_GATEWAY };

    iovec iov[] = {
        { nullptr,          0 },
        { &route,        sizeof(route) },
        { &RTATTR_TABLE, sizeof(RTATTR_TABLE) },
        { &table,        sizeof(table) },
        { &rtaDst,       sizeof(rtaDst) },
        { rawAddress,    static_cast<size_t>(rawLength) },
        { &RTATTR_OIF,   interface != OIF_NONE ? sizeof(RTATTR_OIF) : 0 },
        { &ifindex,      interface != OIF_NONE ? sizeof(ifindex) : 0 },
        { &rtaGateway,   nexthop ? sizeof(rtaGateway) : 0 },
        { rawNexthop,    nexthop ? static_cast<size_t>(rawLength) : 0 },
        { &RTATTR_PRIO,  isDefaultThrowRoute ? sizeof(RTATTR_PRIO) : 0 },
        { &PRIO_THROW,   isDefaultThrowRoute ? sizeof(PRIO_THROW) : 0 },
    };

    uint16_t flags = (action == RTM_NEWROUTE) ? NETLINK_ROUTE_CREATE_FLAGS : NETLINK_REQUEST_FLAGS;

    // Allow creating multiple link-local routes in the same table, so we can make IPv6
    // work on all interfaces in the local_network table.
    if (family == AF_INET6 && IN6_IS_ADDR_LINKLOCAL(reinterpret_cast<in6_addr*>(rawAddress))) {
        flags &= ~NLM_F_EXCL;
    }

    int ret = sendNetlinkRequest(action, flags, iov, ARRAY_SIZE(iov), nullptr);
    if (ret) {
        ALOGE("Error %s route %s -> %s %s to table %u: %s",
              actionName(action), destination, nexthop, interface, table, strerror(-ret));
    }
    return ret;
}
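
The first step of modifyIpRoute parses the destination string into an address family, raw address bytes, and a prefix length. A minimal Java sketch of that parsing step, assuming the "address/length" form; PrefixParser is a hypothetical class and is not a port of netd's parsePrefix:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical parser for destinations such as "192.168.1.0/24".
final class PrefixParser {
    static final class Prefix {
        final byte[] rawAddress;  // 4 bytes for IPv4, 16 bytes for IPv6
        final int prefixLength;

        Prefix(byte[] rawAddress, int prefixLength) {
            this.rawAddress = rawAddress;
            this.prefixLength = prefixLength;
        }
    }

    // Throws IllegalArgumentException if the input is malformed.
    static Prefix parse(String prefix) {
        final int slash = prefix.indexOf('/');
        if (slash <= 0) {
            throw new IllegalArgumentException("expected address/length: " + prefix);
        }
        final byte[] raw;
        try {
            // getByName also resolves host names; a strict parser would reject them.
            raw = InetAddress.getByName(prefix.substring(0, slash)).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad address: " + prefix, e);
        }
        final int length = Integer.parseInt(prefix.substring(slash + 1));
        if (length < 0 || length > raw.length * 8) {
            throw new IllegalArgumentException("bad prefix length: " + prefix);
        }
        return new Prefix(raw, length);
    }
}
```

The length of rawAddress plays the role of rawLength in the C++ code: it decides how many bytes follow the RTA_DST and RTA_GATEWAY attribute headers.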
           

This completes the analysis of the whole route-addition process.

Question:

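The kernel adds the route automatically and then reports it to the upper layers through a netlink event. Why do the upper layers then add the route again? Isn't that redundant?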

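The reason is that Linux uses policy routing, which allows multiple routing tables to exist. Routes that the kernel adds automatically go into the main routing table, whereas routes added by the upper layers go into tables of their own (one per network interface, as RouteController shows). During network communication, policy rules select the appropriate routing table.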
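
To make the idea of rule-selected tables concrete, here is a toy model of policy routing. It is a sketch under simplifying assumptions: rules match only on a packet mark, destinations are matched by exact string rather than longest prefix, and the table names, marks, and priorities are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical toy model: rules pick a table, tables hold routes.
final class PolicyRouting {
    static final int ANY_MARK = -1;

    private static final class Rule {
        final int priority;  // lower value is consulted first
        final int mark;      // packet mark to match, or ANY_MARK
        final String table;

        Rule(int priority, int mark, String table) {
            this.priority = priority;
            this.mark = mark;
            this.table = table;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void addRule(int priority, int mark, String table) {
        rules.add(new Rule(priority, mark, table));
        rules.sort((a, b) -> Integer.compare(a.priority, b.priority));
    }

    void addRoute(String table, String destination, String outputInterface) {
        tables.computeIfAbsent(table, t -> new HashMap<>()).put(destination, outputInterface);
    }

    // Walks the rules in priority order; the first matching rule whose
    // table contains the destination decides the output interface.
    Optional<String> lookup(int mark, String destination) {
        for (Rule rule : rules) {
            if (rule.mark != ANY_MARK && rule.mark != mark) {
                continue;
            }
            final Map<String, String> table = tables.get(rule.table);
            if (table != null && table.containsKey(destination)) {
                return Optional.of(table.get(destination));
            }
        }
        return Optional.empty();
    }
}
```

In this model the same destination can exist in both the main table and a per-interface table, and the rules decide which entry is used. That is why the route the kernel put in the main table does not make the upper layer's own addition redundant.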
