
Commit 63e5d41b authored by Eric Miao

libandroidfw: use map for more efficient indexOfString()

Bug: 249320624

The number of calls to ResStringPool::indexOfString() is very high
due to many calls to Resources.getIdentifier() in Java code. This is
used to get the resource ID (a 32-bit integer) by its resource name
(a string).

One example pprof trace is at:

  https://pprof.corp.google.com/?id=9b9b377bf52bdeb972c6ad154acf3902

This happens to the key string pool in a package, which is normally
encoded in UTF-8 and unsorted. The call to indexOfString() ends up
going through the whole key string pool and doing a per-string
comparison until the string is found.

For some large applications, the key string pool can hold more than
10k strings, which results in an excessive number of calls to
string8At() or stringAt(). Although each call is quick, most of them
are unnecessary.
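
To illustrate the cost, here is a minimal sketch of an unsorted lookup
that counts accessor calls. It is not the libandroidfw code: Pool,
at() and linearIndexOf() are hypothetical names, and at() only stands
in for string8At(). The scan starts at the back, as the code removed
by this CL does.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for an unsorted key string pool.
struct Pool {
    std::vector<std::string> strings;
    mutable std::size_t accessCount = 0;  // number of at() calls so far

    // Stand-in for string8At(): returns a view of entry i and counts the call.
    std::string_view at(std::size_t i) const {
        ++accessCount;
        return strings[i];
    }
};

// Linear scan from the back: one accessor call per entry visited.
std::optional<std::size_t> linearIndexOf(const Pool& pool, std::string_view target) {
    for (std::size_t i = pool.strings.size(); i-- > 0;) {
        if (pool.at(i) == target) {
            return i;
        }
    }
    return std::nullopt;
}
```

With N strings in the pool, a miss or a hit on the first entry costs N
accessor calls, and that cost is paid again on every lookup.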

There are a few ways to improve this:

1. Sort the key strings so the current implementation ends up doing
a binary search. However, there is no code in aapt or aapt2 that
sorts the pool and sets SORTED_FLAG.

2. Or, when needed, set up a std::map<StringPiece, size_t> that maps
each key string to its ID, and use the map for all subsequent calls
to indexOfString().
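
The second option can be sketched as follows. This is an illustration
under assumptions, not the code in this CL: IndexedPool and indexOf()
are hypothetical names, std::string_view stands in for StringPiece,
and the sketch is not thread-safe (the diff guards its index with a
mutex and declares std::unordered_map members rather than std::map).

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical pool that builds a string -> index map on the first lookup.
class IndexedPool {
public:
    explicit IndexedPool(std::vector<std::string> strings)
        : mStrings(std::move(strings)) {}

    // The map keys are views into mStrings, so copying would leave them dangling.
    IndexedPool(const IndexedPool&) = delete;
    IndexedPool& operator=(const IndexedPool&) = delete;

    std::optional<std::size_t> indexOf(std::string_view target) const {
        if (mIndex.empty()) {
            // Build the index once; later calls skip straight to find().
            for (std::size_t i = 0; i < mStrings.size(); ++i) {
                // emplace() keeps the first index if a string occurs twice.
                mIndex.emplace(mStrings[i], i);
            }
        }
        const auto it = mIndex.find(target);
        if (it == mIndex.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::vector<std::string> mStrings;
    mutable std::map<std::string_view, std::size_t> mIndex;
};
```

The first call pays for one pass over the pool; every later call is a
single map lookup.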

Here we chose the second way, with a few justifications:

1. std::map<> has stable O(log N) lookup performance, and a key
comparison doesn't involve comparing the whole key, so it's both
fast and relatively stable.

2. By using StringPiece, we do not allocate memory if the mapped
string doesn't need decoding, and StringPiece supports comparison.
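
Both points can be checked with a small experiment. The sketch below
is an assumption-laden illustration, not part of this CL:
CountingLess, Index and buildIndex() are hypothetical names, and
std::string_view stands in for StringPiece. The comparator counts how
many key comparisons one lookup performs, and the keys are views into
existing storage, so no string data is copied.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Comparator that counts how many key comparisons the map performs.
struct CountingLess {
    std::size_t* count;
    bool operator()(std::string_view a, std::string_view b) const {
        ++*count;
        return a < b;  // lexicographic; stops at the first differing character
    }
};

using Index = std::map<std::string_view, std::size_t, CountingLess>;

// Builds an index whose keys point into `storage`.
// `storage` must outlive the returned map and must not be modified.
Index buildIndex(const std::vector<std::string>& storage, std::size_t* counter) {
    Index index(CountingLess{counter});
    for (std::size_t i = 0; i < storage.size(); ++i) {
        index.emplace(storage[i], i);
    }
    return index;
}
```

Resetting the counter before a find() shows that the number of
comparisons per lookup grows with the height of the tree, on the order
of log2(N), rather than with N.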

This CL reduces the number of calls into string8At() or stringAt()
on the key string pool from 2M+ down to ~400k.

See example pprof trace at:

  https://pprof.corp.google.com/?id=cb677fca932260b2dc7db64ee37c4b61

Change-Id: I77a0ef13daac6d8a5c63a681c52f8ab9712578ed
parent 2faf0508
+51 −123
@@ -1020,141 +1020,69 @@ base::expected<incfs::map_ptr<ResStringPool_span>, NullOrIOError> ResStringPool:
     return base::unexpected(std::nullopt);
 }
 
-base::expected<size_t, NullOrIOError> ResStringPool::indexOfString(const char16_t* str,
-                                                                   size_t strLen) const
-{
-    if (mError != NO_ERROR) {
-        return base::unexpected(std::nullopt);
-    }
-
-    if ((mHeader->flags&ResStringPool_header::UTF8_FLAG) != 0) {
-        if (kDebugStringPoolNoisy) {
-            ALOGI("indexOfString UTF-8: %s", String8(str, strLen).string());
-        }
-
-        // The string pool contains UTF 8 strings; we don't want to cause
-        // temporary UTF-16 strings to be created as we search.
-        if (mHeader->flags&ResStringPool_header::SORTED_FLAG) {
-            // Do a binary search for the string...  this is a little tricky,
-            // because the strings are sorted with strzcmp16().  So to match
-            // the ordering, we need to convert strings in the pool to UTF-16.
-            // But we don't want to hit the cache, so instead we will have a
-            // local temporary allocation for the conversions.
-            size_t convBufferLen = strLen + 4;
-            std::vector<char16_t> convBuffer(convBufferLen);
-            ssize_t l = 0;
-            ssize_t h = mHeader->stringCount-1;
-
-            ssize_t mid;
-            while (l <= h) {
-                mid = l + (h - l)/2;
-                int c = -1;
-                const base::expected<StringPiece, NullOrIOError> s = string8At(mid);
-                if (UNLIKELY(IsIOError(s))) {
-                    return base::unexpected(s.error());
-                }
-                if (s.has_value()) {
-                    char16_t* end = utf8_to_utf16(reinterpret_cast<const uint8_t*>(s->data()),
-                                                  s->size(), convBuffer.data(), convBufferLen);
-                    c = strzcmp16(convBuffer.data(), end-convBuffer.data(), str, strLen);
-                }
-                if (kDebugStringPoolNoisy) {
-                    ALOGI("Looking at %s, cmp=%d, l/mid/h=%d/%d/%d\n",
-                          s->data(), c, (int)l, (int)mid, (int)h);
-                }
-                if (c == 0) {
-                    if (kDebugStringPoolNoisy) {
-                        ALOGI("MATCH!");
-                    }
-                    return mid;
-                } else if (c < 0) {
-                    l = mid + 1;
+template <typename TChar, typename SP>
+base::expected<size_t, NullOrIOError> ResStringPool::stringIndex(
+        SP sp, std::unordered_map<SP, size_t>& map) const
+{
+    AutoMutex lock(mStringIndexLock);
+
+    if (map.empty()) {
+        // build string index on the first call
+        for (size_t i = 0; i < mHeader->stringCount; i++) {
+            base::expected<SP, NullOrIOError> s;
+            if constexpr(std::is_same_v<TChar, char16_t>) {
+                s = stringAt(i);
             } else {
-                    h = mid - 1;
+                s = string8At(i);
             }
+            if (s.has_value()) {
+                const auto r = map.insert({*s, i});
+                if (!r.second) {
+                    ALOGE("failed to build string index, string id=%zu\n", i);
                 }
             } else {
-            // It is unusual to get the ID from an unsorted string block...
-            // most often this happens because we want to get IDs for style
-            // span tags; since those always appear at the end of the string
-            // block, start searching at the back.
-            String8 str8(str, strLen);
-            const size_t str8Len = str8.size();
-            for (int i=mHeader->stringCount-1; i>=0; i--) {
-                const base::expected<StringPiece, NullOrIOError> s = string8At(i);
-                if (UNLIKELY(IsIOError(s))) {
                 return base::unexpected(s.error());
             }
-                if (s.has_value()) {
-                    if (kDebugStringPoolNoisy) {
-                        ALOGI("Looking at %s, i=%d\n", s->data(), i);
-                    }
-                    if (str8Len == s->size()
-                            && memcmp(s->data(), str8.string(), str8Len) == 0) {
-                        if (kDebugStringPoolNoisy) {
-                            ALOGI("MATCH!");
-                        }
-                        return i;
-                    }
-                }
         }
     }
 
+    if (!map.empty()) {
+        const auto result = map.find(sp);
+        if (result != map.end())
+            return result->second;
+    }
+    return base::unexpected(std::nullopt);
+}
+
-    } else {
-        if (kDebugStringPoolNoisy) {
-            ALOGI("indexOfString UTF-16: %s", String8(str, strLen).string());
+base::expected<size_t, NullOrIOError> ResStringPool::indexOfString(const char16_t* str,
+                                                                   size_t strLen) const
+{
+    if (mError != NO_ERROR) {
+        return base::unexpected(std::nullopt);
     }
 
-        if (mHeader->flags&ResStringPool_header::SORTED_FLAG) {
-            // Do a binary search for the string...
-            ssize_t l = 0;
-            ssize_t h = mHeader->stringCount-1;
-
-            ssize_t mid;
-            while (l <= h) {
-                mid = l + (h - l)/2;
-                const base::expected<StringPiece16, NullOrIOError> s = stringAt(mid);
-                if (UNLIKELY(IsIOError(s))) {
-                    return base::unexpected(s.error());
-                }
-                int c = s.has_value() ? strzcmp16(s->data(), s->size(), str, strLen) : -1;
-                if (kDebugStringPoolNoisy) {
-                    ALOGI("Looking at %s, cmp=%d, l/mid/h=%d/%d/%d\n",
-                          String8(s->data(), s->size()).string(), c, (int)l, (int)mid, (int)h);
-                }
-                if (c == 0) {
     if (kDebugStringPoolNoisy) {
-                        ALOGI("MATCH!");
-                    }
-                    return mid;
-                } else if (c < 0) {
-                    l = mid + 1;
-                } else {
-                    h = mid - 1;
-                }
+        ALOGI("indexOfString (%s): %s", isUTF8() ? "UTF-8" : "UTF-16",
+                String8(str, strLen).string());
     }
+
+    base::expected<size_t, NullOrIOError> idx;
+    if (isUTF8()) {
+        auto str8 = String8(str, strLen);
+        idx = stringIndex<char>(StringPiece(str8.c_str(), str8.size()), mStringIndex8);
     } else {
-            // It is unusual to get the ID from an unsorted string block...
-            // most often this happens because we want to get IDs for style
-            // span tags; since those always appear at the end of the string
-            // block, start searching at the back.
-            for (int i=mHeader->stringCount-1; i>=0; i--) {
-                const base::expected<StringPiece16, NullOrIOError> s = stringAt(i);
-                if (UNLIKELY(IsIOError(s))) {
-                    return base::unexpected(s.error());
+        idx = stringIndex<char16_t>(StringPiece16(str, strLen), mStringIndex16);
     }
-                if (kDebugStringPoolNoisy) {
-                    ALOGI("Looking at %s, i=%d\n", String8(s->data(), s->size()).string(), i);
+
+    if (UNLIKELY(!idx.has_value())) {
+        return base::unexpected(idx.error());
     }
-                if (s.has_value() && strLen == s->size() &&
-                        strzcmp16(s->data(), s->size(), str, strLen) == 0) {
+
+    if (*idx < mHeader->stringCount) {
         if (kDebugStringPoolNoisy) {
-                        ALOGI("MATCH!");
-                    }
-                    return i;
-                }
-            }
+            ALOGI("MATCH! (idx=%zu)", *idx);
         }
+        return *idx;
     }
     return base::unexpected(std::nullopt);
 }
+10 −0
@@ -41,6 +41,7 @@
 #include <array>
 #include <map>
 #include <memory>
+#include <unordered_map>
 
 namespace android {
 
@@ -562,8 +563,17 @@ private:
     incfs::map_ptr<uint32_t>                      mStyles;
     uint32_t                                      mStylePoolSize;    // number of uint32_t
 
+    // mStringIndex is used to quickly map a string to its ID
+    mutable Mutex                                       mStringIndexLock;
+    mutable std::unordered_map<StringPiece, size_t>     mStringIndex8;
+    mutable std::unordered_map<StringPiece16, size_t>   mStringIndex16;
+
     base::expected<StringPiece, NullOrIOError> stringDecodeAt(
         size_t idx, incfs::map_ptr<uint8_t> str, size_t encLen) const;
+
+    template <typename TChar, typename SP=BasicStringPiece<TChar>>
+    base::expected<size_t, NullOrIOError> stringIndex(
+        SP str, std::unordered_map<SP, size_t>& map) const;
 };
 
 /**