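iOS: The Difference Between class_ro_t and class_rw_t, and How Categories Are Loaded

This article covers the difference between class_ro_t and class_rw_t, how categories are loaded, and what happens when several categories are loaded for the same class.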

class_ro_t

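class_ro_t stores the properties, methods, and adopted protocols of a class that are already fixed at compile time; it contains no category methods. Methods added at runtime are stored in class_rw_t, which is created at runtime.

ro stands for read only: the structure cannot be modified.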

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif
    const uint8_t * ivarLayout;
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

class_rw_t

The properties, methods, and adopted protocols of an ObjC class are all kept in class_rw_t:

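// Readable and writable
struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;
    const class_ro_t *ro; // points to the read-only struct that holds the class's initial information
    /*
     The next three members are two-dimensional arrays and are read-write.
     They hold the class's initial contents plus the contents of its categories.
     For example, methods stores method_list_t entries, and each method_list_t stores method_t entries.
     Part of the data in these three arrays is merged in from class_ro_t.
     */
    method_array_t methods; // method lists (a class object stores instance methods, a metaclass object stores class methods)
    property_array_t properties; // property lists
    protocol_array_t protocols; // protocol lists
    Class firstSubclass;
    Class nextSiblingClass;
    //...
};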

When class_rw_t is created

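class_rw_t is created at runtime. At compile time the class_ro_t structure is already fixed, and the data part of bits in objc_class holds its address. Once the runtime is up, specifically when the runtime's realizeClass function runs, a class_rw_t structure is created that refers to the class_ro_t, and the data part is updated to hold the address of the class_rw_t instead.

[Figure: the class before realizeClass runs]

A closer look at the two structures shows that they have a lot in common: both describe the class's properties, methods, protocols, and so on (the instance variable list lives only in class_ro_t). The difference is that class_ro_t holds what is fixed at compile time, whereas class_rw_t is filled in at runtime: it first takes over the contents of class_ro_t and then adds the properties, methods, and so on of the class's categories. class_rw_t can therefore be described as a superset of class_ro_t, and actual lookups of a class's methods and properties go through class_rw_t.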

Excerpted from https://www.jianshu.com/p/823eaedb3697
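The hand-over from class_ro_t to class_rw_t described above can be modelled in a few lines of C++. This is only a sketch under simplifying assumptions: RoModel, RwModel, ClassModel, realizeModel, and addMethodModel are hypothetical names, methods are plain strings, and the real runtime keeps a single data pointer inside bits rather than two separate fields.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Plays the role of class_ro_t: contents fixed at compile time.
struct RoModel {
    std::string name;
    std::vector<std::string> baseMethods;
};

// Plays the role of class_rw_t: created at runtime, points back to ro.
struct RwModel {
    const RoModel *ro;
    std::vector<std::string> methods;  // base methods plus anything added later
};

// Hypothetical class object. Before realization only ro is set;
// after realization rw is set and refers to the same ro.
struct ClassModel {
    const RoModel *ro = nullptr;
    std::unique_ptr<RwModel> rw;
    bool realized() const { return rw != nullptr; }
};

// Rough analogue of realizeClass: allocate the writable part, keep a
// pointer to the read-only part, and seed the method list from it.
inline void realizeModel(ClassModel &cls) {
    if (cls.realized()) return;  // already realized, nothing to do
    cls.rw.reset(new RwModel{cls.ro, cls.ro->baseMethods});
}

// Adding a method at runtime only touches rw; ro is never modified.
inline void addMethodModel(ClassModel &cls, const std::string &selector) {
    realizeModel(cls);
    cls.rw->methods.push_back(selector);
}
```

In this model, a class built from an RoModel with two base methods still reports two entries in ro after addMethodModel has been called, while rw grows to three. That is the "superset" relationship the paragraph above describes.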

How category methods are loaded into class_rw_t

  • After the compiled program launches, the Runtime initializes itself by calling _objc_init.
  • _objc_init is driven by dyld. At this stage it registers 3 callbacks: mapped, init, and unmapped.
    /***********************************************************************
    * _objc_init
    * Bootstrap initialization. Registers our image notifier with dyld.
    * Called by libSystem BEFORE library initialization time
    **********************************************************************/
    void _objc_init(void)
    {
        static bool initialized = false;
        if (initialized) return;
        initialized = true;
        // fixme defer initialization until an objc-using image is found?
        environ_init(); // environment and debugging setup; a zombie-mode setting, for example, takes effect here
        tls_init(); // tls is thread-local storage: data kept in a per-thread area, e.g. via pthread_setspecific(); used by autoreleasepool and when capturing stack information
        static_init(); // runs C++ static constructors
        lock_init(); // obtains two thread priorities here: the background-priority thread and the main thread
        exception_init(); // initializes libobjc's exception handling system
        _dyld_objc_notify_register(&map_images, load_images, unmap_image);
    }
    

    The key call here is _dyld_objc_notify_register:

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
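To illustrate the registration step, here is a toy C++ model of the handshake: one side stores three callbacks, and later fires "mapped" and then "init" when an image comes in, and "unmapped" when it goes away. LoaderModel, ImageCallback, and demoCallbackOrder are hypothetical names, and images are reduced to strings; the real dyld interface passes mach_header pointers and paths.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using ImageCallback = std::function<void(const std::string &image)>;

// Hypothetical stand-in for the loader (dyld) side of the handshake.
class LoaderModel {
public:
    // Plays the role of _dyld_objc_notify_register: remember three callbacks.
    void registerCallbacks(ImageCallback mapped, ImageCallback init,
                           ImageCallback unmapped) {
        mapped_ = std::move(mapped);
        init_ = std::move(init);
        unmapped_ = std::move(unmapped);
    }

    // When an image is brought in, "mapped" fires before "init".
    void loadImage(const std::string &image) const {
        if (mapped_) mapped_(image);
        if (init_) init_(image);
    }

    // When an image is removed, "unmapped" fires.
    void unloadImage(const std::string &image) const {
        if (unmapped_) unmapped_(image);
    }

private:
    ImageCallback mapped_, init_, unmapped_;
};

// Wires up three logging callbacks and returns the order in which they fired.
inline std::vector<std::string> demoCallbackOrder(const std::string &image) {
    std::vector<std::string> log;
    LoaderModel loader;
    loader.registerCallbacks(
        [&log](const std::string &img) { log.push_back("map_images(" + img + ")"); },
        [&log](const std::string &img) { log.push_back("load_images(" + img + ")"); },
        [&log](const std::string &img) { log.push_back("unmap_image(" + img + ")"); });
    loader.loadImage(image);
    loader.unloadImage(image);
    return log;
}
```

The point of the sketch is only the ordering: the structures are built in the "mapped" callback before any +load method runs in the "init" callback.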
    

    map_images

  • Next, map_images is called:

    void
    map_images(unsigned count, const char * const paths[],
               const struct mach_header * const mhdrs[])
    {
        rwlock_writer_t lock(runtimeLock);
        return map_images_nolock(count, paths, mhdrs);
    }

  • map_images in turn calls map_images_nolock:

    void
    map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                      const struct mach_header * const mhdrs[])
    {
        //.... (a large block omitted)
        if (hCount > 0) {
            _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
        }
        //...
    }

  • Then comes _read_images, which reads the information about all classes. The function is long, but it essentially does one thing: it reads the sections of the Mach-O file one by one and initializes the runtime's in-memory structures from their contents. Going by its comments, _read_images mainly does the following:

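  • Decide whether isa optimization must be disabled. There are three cases: the image contains Swift code from before Swift 3.0; the OS X version is earlier than 10.11; or, on OS X, the Mach-O __DATA segment explicitly contains __objc_rawisa (do not use the optimized isa).
  • Starting with the ARM64 architecture, Apple optimized isa by defining it as a union that combines bit fields and bitwise operations to store more class-related information. The isa value has to be ANDed (&) with a mask called ISA_MASK to obtain the real address of the class/meta-class object.
    Reference: https://www.jianshu.com/p/30de582dbeb7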

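  • Check whether tagged pointers are disabled.
  • Read the class list from the __objc_classlist section.
  • Read class reference information from the __objc_classrefs section and process it with remapClassRef.
  • Read selector reference information from the __objc_selrefs section and process it with sel_registerNameNoLock.
  • Read the classes' protocol information from the __objc_protolist section and call readProtocol to read each protocol.
  • Read protocol reference information from the __objc_protorefs section and process it with remapProtocolRef.
  • Read non-lazy class information from the __objc_nlclslist section and call static Class realizeClass(Class cls) to realize these classes. The core of realizeClass is initializing the objc_class data structure and giving it its initial values.
  • Read category information from the __objc_catlist section and call addUnattachedCategoryForClass to register each category against its class or metaclass, so that its methods, properties, and protocols can be attached.
  • Call remethodizeClass, whose name means "re-methodize".
  • remethodizeClass internally calls attachCategories, which receives the Class and its categories and merges their method lists, protocol lists, and so on with those of the original class. The result ends up in the class_rw_t structure.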
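The ISA_MASK step mentioned in the list above is a single bitwise AND. The sketch below uses the two mask values that objc4 defines in isa.h for arm64 and x86_64 (other targets, such as arm64e, use different values). The names kIsaMaskArm64, kIsaMaskX86_64, and classAddressFromIsa are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// ISA_MASK values from objc4's isa.h for the two classic 64-bit layouts.
constexpr std::uint64_t kIsaMaskArm64  = 0x0000000ffffffff8ULL;
constexpr std::uint64_t kIsaMaskX86_64 = 0x00007ffffffffff8ULL;

// Extract the class (or metaclass) address from a packed isa value.
// The mask keeps the address bits and clears the flag and counter bits
// stored around them.
constexpr std::uint64_t classAddressFromIsa(std::uint64_t isaBits,
                                            std::uint64_t mask) {
    return isaBits & mask;
}
```

For example, a hypothetical isa value of 0x0100000100008e51 (class address 0x100008e50 with the lowest flag bit and one high bit set) masks back to 0x100008e50 under either constant.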
    load_images

    Once class_rw_t has been built, load_images calls call_load_methods, which starts calling the +load methods of classes and of categories.

    /***********************************************************************
    * call_load_methods
    * Call all pending class and category +load methods.
    * Class +load methods are called superclass-first. 
    * Category +load methods are not called until after the parent class's +load.
    * This method must be RE-ENTRANT, because a +load could trigger 
    * more image mapping. In addition, the superclass-first ordering 
    * must be preserved in the face of re-entrant calls. Therefore, 
    * only the OUTERMOST call of this function will do anything, and 
    * that call will handle all loadable classes, even those generated 
    * while it was running.
    * The sequence below preserves +load ordering in the face of 
    * image loading during a +load, and make sure that no 
    * +load method is forgotten because it was added during 
    * a +load call.
    * Sequence:
    * 1. Repeatedly call class +loads until there aren't any more
    * 2. Call category +loads ONCE.
    * 3. Run more +loads if:
    *    (a) there are more classes to load, OR
    *    (b) there are some potential category +loads that have 
    *        still never been attempted.
    * Category +loads are only run once to ensure "parent class first" 
    * ordering, even if a category +load triggers a new loadable class 
    * and a new loadable category attached to that class. 
    * Locking: loadMethodLock must be held by the caller 
    *   All other locks must not be held.
    **********************************************************************/
    void call_load_methods(void)
    {
        static bool loading = NO;
        bool more_categories;
        loadMethodLock.assertLocked();
        // Re-entrant calls do nothing; the outermost call will finish the job.
        if (loading) return;
        loading = YES;
        void *pool = objc_autoreleasePoolPush();
        do {
            // 1. Repeatedly call class +loads until there aren't any more
            while (loadable_classes_used > 0) {
                call_class_loads();
            }
            // 2. Call category +loads ONCE
            more_categories = call_category_loads();
            // 3. Run more +loads if there are classes OR more untried categories
        } while (loadable_classes_used > 0  ||  more_categories);
        objc_autoreleasePoolPop(pool);
        loading = NO;
    }
    

    unmap_image

    unmap_image calls _unload_image.

    This releases a number of resources, such as the unattached list and the +load queue. Each class is first detached and then freed.

    /***********************************************************************
    * _unload_image
    * Only handles MH_BUNDLE for now.
    * Locking: write-lock and loadMethodLock acquired by unmap_image
    **********************************************************************/
    void _unload_image(header_info *hi)
    {
        size_t count, i;
        loadMethodLock.assertLocked();
        runtimeLock.assertWriting();
        // Unload unattached categories and categories waiting for +load.
        category_t **catlist = _getObjc2CategoryList(hi, &count);
        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            if (!cat) continue;  // category for ignored weak-linked class
            Class cls = remapClass(cat->cls);
            assert(cls);  // shouldn't have live category for dead class
            // fixme for MH_DYLIB cat's class may have been unloaded already
            // unattached list
            removeUnattachedCategoryForClass(cat, cls);
            // +load queue
            remove_category_from_loadable_list(cat);
        }
        // Unload classes.
        // Gather classes from both __DATA,__objc_clslist 
        // and __DATA,__objc_nlclslist. arclite's hack puts a class in the latter
        // only, and we need to unload that class if we unload an arclite image.
        NXHashTable *classes = NXCreateHashTable(NXPtrPrototype, 0, nil);
        classref_t *classlist;
        classlist = _getObjc2ClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (cls) NXHashInsert(classes, cls);
        }
        classlist = _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (cls) NXHashInsert(classes, cls);
        }
        // First detach classes from each other. Then free each class.
        // This avoid bugs where this loop unloads a subclass before its superclass
        NXHashState hs;
        Class cls;
        hs = NXInitHashState(classes);
        while (NXNextHashState(classes, &hs, (void**)&cls)) {
            remove_class_from_loadable_list(cls);
            detach_class(cls->ISA(), YES);
            detach_class(cls, NO);
        }
        hs = NXInitHashState(classes);
        while (NXNextHashState(classes, &hs, (void**)&cls)) {
            free_class(cls->ISA());
            free_class(cls);
        }
        NXFreeHashTable(classes);
        // XXX FIXME -- Clean up protocols:
        // <rdar://problem/9033191> Support unloading protocols at dylib/image unload time
        // fixme DebugUnload
    }
    

    The order of +load in two categories, and the order of same-named methods in two categories

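    +load is called when images are loaded. Suppose there is a Person class: the +load of the main class and of every one of its categories is called. The main class comes first, and if the class has an inheritance chain the order must be the base class's +load, then the parent class's, and finally the subclass's. Category +load methods follow compile order: compiled first, called first; compiled later, called later. The order can be checked under Build Phases (Compile Sources) in Xcode. A test demo is available to download and run.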
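The ordering rules above (superclass before subclass, all classes before categories, categories in compile order) can be reproduced with a small C++ model. It is a sketch under simplifying assumptions: LoadScheduleModel and loadCallOrder are hypothetical names, every listed class and category is assumed to implement +load, and everything is treated as one image.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Collects classes so that every superclass comes before its subclasses.
class LoadScheduleModel {
public:
    // superclassOf maps a class name to its superclass ("" for a root class).
    explicit LoadScheduleModel(std::map<std::string, std::string> superclassOf)
        : superclassOf_(std::move(superclassOf)) {}

    void scheduleClass(const std::string &cls) {
        if (cls.empty() || scheduled_.count(cls)) return;
        auto it = superclassOf_.find(cls);
        if (it != superclassOf_.end()) scheduleClass(it->second);  // superclass first
        scheduled_.insert(cls);
        classes_.push_back(cls);
    }

    const std::vector<std::string> &classes() const { return classes_; }

private:
    std::map<std::string, std::string> superclassOf_;
    std::set<std::string> scheduled_;
    std::vector<std::string> classes_;
};

// Returns the +load calls in the order the model would make them:
// all classes (superclass first), then categories in compile order.
inline std::vector<std::string> loadCallOrder(
        const std::map<std::string, std::string> &superclassOf,
        const std::vector<std::string> &classesInCompileOrder,
        const std::vector<std::string> &categoriesInCompileOrder) {
    LoadScheduleModel schedule(superclassOf);
    for (const auto &cls : classesInCompileOrder) schedule.scheduleClass(cls);

    std::vector<std::string> calls;
    for (const auto &cls : schedule.classes()) calls.push_back("+[" + cls + " load]");
    for (const auto &cat : categoriesInCompileOrder) calls.push_back("+[" + cat + " load]");
    return calls;
}
```

With Student (a subclass of Person) compiled before Person, and the categories Person(B) and Person(A) compiled in that order, the model yields Person, Student, Person(B), Person(A): compile order matters for categories but not for a class and its superclass.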

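    Another question is the order of +initialize. It is called the first time a class is used. The underlying implementation first checks whether the superclass has been initialized; if not, the superclass's +initialize is called first, and then the current class's. Now consider a class A with several categories, each implementing +initialize. Because this goes through the normal message-send path, only one +initialize is called: the one in the category that was compiled last. See the demo mentioned above for verification.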
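Why the last-compiled category wins can be shown with a small C++ model of method lookup. It assumes, as a simplification, that each category's method list is placed in front of the lists already attached and that lookup takes the first match. MethodModel, ClassMethodsModel, attachCategory, and implementerOf are hypothetical names, not runtime APIs.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// One method entry: the selector name plus a label for who implemented it.
struct MethodModel {
    std::string selector;
    std::string owner;
};

using MethodListModel = std::vector<MethodModel>;

class ClassMethodsModel {
public:
    // The class's own (compile-time) methods are attached first.
    explicit ClassMethodsModel(MethodListModel base) {
        lists_.push_back(std::move(base));
    }

    // Each category list goes in front of everything already attached,
    // so categories attached later are searched earlier.
    void attachCategory(MethodListModel categoryMethods) {
        lists_.push_front(std::move(categoryMethods));
    }

    // Linear search, front to back; the first match wins.
    // Returns an empty string when no list contains the selector.
    std::string implementerOf(const std::string &selector) const {
        for (const auto &list : lists_) {
            for (const auto &method : list) {
                if (method.selector == selector) return method.owner;
            }
        }
        return "";
    }

private:
    std::deque<MethodListModel> lists_;  // front = searched first
};
```

If class A and two categories all implement initialize, and the categories are attached in compile order, implementerOf("initialize") returns the second category. The class's own implementation is still in the lists; it is simply never reached by the lookup.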

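    Consider one more case: a +load method uses another class, for example by calling a method of B. That is simply a message send, and because B has not been initialized yet, its +initialize is called, even though B's own +load may not yet have been called by the system.

    Summary: both +load and +initialize are called automatically by the runtime. If a developer manually calls [super load] or [super initialize], the call goes through the message-send path, which brings its own call sequence and deserves attention.

    ... -> realizeClass -> methodizeClass (which attaches categories) -> attachCategories. The key lies in the implementation of methodizeClass: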

    static void methodizeClass(Class cls)
    {
        runtimeLock.assertLocked();
        bool isMeta = cls->isMetaClass();
        auto rw = cls->data();
        auto ro = rw->ro;
        // =======================================
            // omitted.....
        // =======================================
        property_list_t *proplist = ro->baseProperties;
        if (proplist) {
            rw->properties.attachLists(&proplist, 1);
        }
        // =======================================
            // omitted.....
        // =======================================
        // Attach categories.
        category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
        attachCategories(cls, cats, false /*don't flush caches*/);
        // =======================================
            // omitted.....
        // =======================================
        if (cats) free(cats);
    }
    

    The code above shows that baseProperties is attached first and the categories afterwards, but the resulting order is decided by rw->properties.attachLists:

    property_list_t *proplist = ro->baseProperties;
    if (proplist) {
      rw->properties.attachLists(&proplist, 1);
    }
    /// Categories are attached through this method
    void attachLists(List* const * addedLists, uint32_t addedCount) {
            if (addedCount == 0) return;
            if (hasArray()) {
                // many lists -> many lists
                uint32_t oldCount = array()->count;
                uint32_t newCount = oldCount + addedCount;
                setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
                array()->count = newCount;
                // Shift the old contents back by addedCount, then copy
                // addedLists to the start. For reference:
                //     struct array_t {
                //         uint32_t count;
                //         List* lists[0];
                //     };
                memmove(array()->lists + addedCount, array()->lists, 
                        oldCount * sizeof(array()->lists[0]));
                memcpy(array()->lists, addedLists, 
                       addedCount * sizeof(array()->lists[0]));
            }
            else if (!list  &&  addedCount == 1) {
                // 0 lists -> 1 list
                list = addedLists[0];
            }
            else {
                // 1 list -> many lists
                List* oldList = list;
                uint32_t oldCount = oldList ? 1 : 0;
                uint32_t newCount = oldCount + addedCount;
                setArray((array_t *)malloc(array_t::byteSize(newCount)));
                array()->count = newCount;
                if (oldList) array()->lists[addedCount] = oldList;
                memcpy(array()->lists, addedLists,