
Commit 5e5c7647 authored by Chalard Jean's avatar Chalard Jean
Browse files

[MS06] Add the relevance utils.

This models the relevance of the network.

Test: This contains new tests, and they pass.
Bug: 116512211
Change-Id: I09167532c4041a12abf749454138d56a764b22a6
parent 8fe503f1
+307 −0
/*
 * Copyright (C) 2018 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package com.android.server.net.ipmemorystore;

import com.android.internal.annotations.VisibleForTesting;

/**
 * A class containing the logic around the relevance value for
 * IP Memory Store.
 *
 * @hide
 */
public class RelevanceUtils {
    /**
     * The relevance is a decaying value that gets lower and lower until it
 * reaches 0 after some time passes. It follows an exponential decay law,
 * decaying, in proportion to its remaining value, slowly at first then
 * faster and faster, because a network is
     * likely to be visited again if it was visited not long ago, and the longer
     * it hasn't been visited the more likely it is that it won't be visited
     * again. For example, a network visited on holiday should stay fresh for
     * the duration of the holiday and persist for a while, but after the venue
     * hasn't been visited for a while it should quickly be discarded. What
     * should accelerate forgetting the network is extended periods without
     * visits, so that occasional venues get discarded but regular visits keep
     * the network relevant, even if the visits are infrequent.
     *
     * This function must be stable by iteration, meaning that adjusting the same value
     * for different dates iteratively multiple times should give the same result.
 * Formally, if f is the decay function that maps a relevance x at a date d1
 * to its value at a later date d3, then for any date d2 between d1 and d3 :
     * f(x, d3 - d1) = f(f(x, d3 - d2), d2 - d1). Intuitively, this property simply
     * means it should be the same to compute and store back the value after two months,
     * or to do it once after one month, store it back, and do it again after another
 * month has passed.
     * The pair of the relevance and date define the entire curve, so any pair
     * of values on the curve will define the same curve. Setting one of them to a
     * constant, so as not to have to store it, means the other one will always suffice
     * to describe the curve. For example, only storing the date for a known, constant
     * value of the relevance is an efficient way of remembering this information (and
     * to compare relevances together, as f is monotonically decreasing).
     *
     *** Choosing the function :
     * Functions of the kind described above are standard exponential decay functions
     * like the ones that govern atomic decay where the value at any given date can be
     * computed uniformly from the value at a previous date and the time elapsed since
     * that date. It is simple to picture this kind of function as one where after a
     * given period of time called the half-life, the relevance value will have been
     * halved. Decay of this kind is expressed in function of the previous value by
     * functions like
     * f(x, t) = x * F ^ (t / L)
     * ...where x is the value, t is the elapsed time, L is the half-life (or more
     * generally the F-th-life) and F the decay factor (typically 0.5, hence why L is
     * usually called the half-life). The ^ symbol here is used for exponentiation.
     * Or, starting at a given M for t = 0 :
     * f(t) = M * F ^ (t / L)
     *
 * Because a line in the store needs to become irrelevant at some point but
 * functions of this class never reach 0, a minimum cutoff has to be chosen to
 * represent irrelevance. The simplest way of doing this is to add this
 * minimum cutoff to the computation before and remove it after.
     * Thus the function becomes :
     * f(x, t) = ((x + K) * F ^ (t / L)) - K
     * ...where K is the minimum cutoff, L the half-life, and F the factor between
     * the original x and x after its half-life. Strictly speaking using the word
     * "half-life" implies that F = 0.5, but the relation works for any value of F.
     *
     * It is easy enough to check that this function satisfies the stability
     * relation that was given above for any value of F, L and K, which become
     * parameters that can be defined at will.
     *
     * relevance
     *  1.0 |
     *      |\
     *      | \
     *      |  \            (this graph rendered with L = 75 days and K = 1/40)
     *  0.75|   ',
     *      |     \
     *      |      '.
     *      |        \.
     *      |          \
     *  0.5 |           '\
     *      |             ''.
     *      |                ''.
     *      |                   ''.
     *  0.25|                      '''..
     *      |                           '''..
     *      |                                ''''....
     *      |                                        '''''..........
     *    0 +-------------------------------------------------------''''''''''----
     *      0       50       100      150     200      250     300      350     400 days
     *
     *** Choosing the parameters
     * The maximum M is an arbitrary parameter that simply scales the curve.
     * The tradeoff for M is pretty simple : if the relevance is going to be an
     * integer, the bigger M is the more precision there is in the relevance.
     * However, values of M that are easy for humans to read are preferable to
     * help debugging, and a suitably low value may be enough to ensure there
     * won't be integer overflows in intermediate computations.
     * A value of 1_000_000 probably is plenty for precision, while still in the
     * low range of what ints can represent.
     *
     * F and L are parameters to be chosen arbitrarily and have an impact on how
     * fast the relevance will be decaying at first, keeping in mind that
     * the 400 days value and the cap stay the same. In simpler words, F and L
     * define the steepness of the curve.
 * To keep things simple (and familiar) F is arbitrarily chosen to be 0.5, and
 * L is set to 200 days, chosen visually to achieve the desired effect. Refer
 * to the illustration above to get a feel for the resulting curve.
     *
     * Moreover, the memory store works on an assumption that the relevance should
     * be capped, and that an entry with capped relevance should decay in 400 days.
 * This is on the premise that the networks a device will need to remember
 * the longest are networks visited about once a year.
     * For this reason, the relevance is at the maximum M 400 days before expiry :
     * f(M, 400 days) = 0
 * Replacing f with its expression above, K can then be derived from the
 * values of M, F and L :
     * (M + K) * F ^ (t / L) - K = 0
     * K = M * F ^ (400 days / L) / (1 - F ^ (400 days / L))
     * Replacing with actual values this gives :
     * K = 1_000_000 * 0.5 ^ (400 / 200) / (1 - 0.5 ^ (400 / 200))
     *   = 1_000_000 / 3 ≈ 333_333.3
     * This ensures the function has the desired profile, the desired value at
     * cap, and the desired value at expiry.
     *
     *** Useful relations
 * Let's define the expiry time for any given relevance x as the interval of
 * time such that :
 * f(x, expiry) = 0
 * which can be rewritten
 * ((x + K) * F ^ (expiry / L)) = K
 * ...giving an expression of the expiry as a function of the relevance x :
 * expiry = L * logF(K / (x + K))
 * Conversely the relevance x can be expressed as a function of the expiry :
 * x = K / F ^ (expiry / L) - K
 * These relations are useful in the utility functions below.
     *
     *** Bumping things up
     * The last issue therefore is to decide how to bump up the relevance. The
     * simple approach is to simply lift up the curve a little bit by a constant
     * normalized amount, delaying the time of expiry. For example increasing
     * the relevance by an amount I gives :
     * x2 = x1 + I
 * x2 and x1 correspond to two different expiry times expiry2 and expiry1,
 * and replacing x1 and x2 in the relation above with their expressions as
 * functions of the expiry gives :
     * K / F ^ (expiry2 / L) - K = K / F ^ (expiry1 / L) - K + I
     * which resolves to :
     * expiry2 = L * logF(K / (I + K / F ^ (expiry1 / L)))
     *
     * In this implementation, the bump is defined as 1/25th of the cap for
     * the relevance. This means a network will be remembered for the maximum
     * period of 400 days if connected 25 times in succession not accounting
 * for decay. Of course decay actually happens, so it will take more than 25
 * connections for any given network to actually reach the cap, but because
 * decay is slow at first, this is a good estimate of how fast the cap is reached.
     *
     * Specifically, it gives the following four results :
     * - A network that a device connects to once hits irrelevance about 32.7 days after
     *   it was first registered if never connected again.
     * - A network that a device connects to once a day at a fixed hour will hit the cap
     *   on the 27th connection.
     * - A network that a device connects to once a week at a fixed hour will hit the cap
     *   on the 57th connection.
     * - A network that a device connects to every day for 7 straight days then never again
     *   expires 144 days after the last connection.
 * These metrics match the requirements pretty well.
     */

    // TODO : make these constants configurable at runtime. Make sure any change either
    // wipes the database, migrates the values, or otherwise keeps the relevance
    // values meaningful.

    // How long, in milliseconds, is a capped relevance valid for, or in other
    // words how many milliseconds after its relevance was set to RELEVANCE_CAP does
    // any given line expire. 400 days.
    @VisibleForTesting
    public static final long CAPPED_RELEVANCE_LIFETIME_MS = 400L * 24 * 60 * 60 * 1000;

    // The constant that represents a normalized 1.0 value for the relevance. In other words,
    // the cap for the relevance. This is referred to as M in the explanation above.
    @VisibleForTesting
    public static final int CAPPED_RELEVANCE = 1_000_000;

    // The decay factor. After a half-life, the relevance will have decayed by this value.
    // This is referred to as F in the explanation above.
    private static final double DECAY_FACTOR = 0.5;

    // The half-life. After this time, the relevance will have decayed by a factor DECAY_FACTOR.
    // This is referred to as L in the explanation above.
    private static final long HALF_LIFE_MS = 200L * 24 * 60 * 60 * 1000;

    // The minimum cutoff used to shift the decay curve so that it reaches 0.
    // This is referred to as K in the explanation above.
    private static final double IRRELEVANCE_FLOOR =
            CAPPED_RELEVANCE * powF((double) CAPPED_RELEVANCE_LIFETIME_MS / HALF_LIFE_MS)
            / (1 - powF((double) CAPPED_RELEVANCE_LIFETIME_MS / HALF_LIFE_MS));

    // How much to bump the relevance by every time a line is written to.
    @VisibleForTesting
    public static final int RELEVANCE_BUMP = CAPPED_RELEVANCE / 25;

    // Java doesn't include a function for the logarithm in an arbitrary base, so implement it here.
    private static final double LOG_DECAY_FACTOR = Math.log(DECAY_FACTOR);
    private static double logF(final double value) {
        return Math.log(value) / LOG_DECAY_FACTOR;
    }

    // Utility function to get a power of the decay factor, to simplify the code.
    private static double powF(final double value) {
        return Math.pow(DECAY_FACTOR, value);
    }

    /**
     * Compute the value of the relevance now given an expiry date.
     *
     * @param expiry the date at which the column in the database expires.
     * @return the adjusted value of the relevance for this moment in time.
     */
    public static int computeRelevanceForNow(final long expiry) {
        return computeRelevanceForTargetDate(expiry, System.currentTimeMillis());
    }

    /**
     * Compute the value of the relevance at a given date from an expiry date.
     *
     * Because relevance decays with time, a relevance in the past corresponds to
     * a different relevance later.
     *
     * Relevance is always a positive value. 0 means not relevant at all.
     *
     * See the explanation at the top of this file to get the justification for this
     * computation.
     *
     * @param expiry the date at which the column in the database expires.
     * @param target the target date to adjust the relevance to.
     * @return the adjusted value of the relevance for the target moment.
     */
    public static int computeRelevanceForTargetDate(final long expiry, final long target) {
        final long delay = expiry - target;
        if (delay >= CAPPED_RELEVANCE_LIFETIME_MS) return CAPPED_RELEVANCE;
        if (delay <= 0) return 0;
        return (int) (IRRELEVANCE_FLOOR / powF((double) delay / HALF_LIFE_MS) - IRRELEVANCE_FLOOR);
    }

    /**
     * Compute the expiry duration adjusted up for a new fresh write.
     *
     * Every time data is written to the memory store for a given line, the
     * relevance is bumped up by a certain amount, which will boost the priority
     * of this line for computation of group attributes, and delay (possibly
     * indefinitely, if the line is accessed regularly) forgetting the data stored
     * in that line.
     * As opposed to bumpExpiryDate, this function uses a duration from now to expiry.
     *
     * See the explanation at the top of this file for a justification of this computation.
     *
     * @param oldExpiryDuration the old expiry duration in milliseconds from now.
     * @return the expiry duration representing a bumped up relevance value.
     */
    public static long bumpExpiryDuration(final long oldExpiryDuration) {
        // L * logF(K / (I + K / F ^ (expiry1 / L))), as documented above
        final double divisionFactor = powF(((double) oldExpiryDuration) / HALF_LIFE_MS);
        final double oldRelevance = IRRELEVANCE_FLOOR / divisionFactor;
        final long newDuration =
                (long) (HALF_LIFE_MS * logF(IRRELEVANCE_FLOOR / (RELEVANCE_BUMP + oldRelevance)));
        return Math.min(newDuration, CAPPED_RELEVANCE_LIFETIME_MS);
    }

    /**
     * Compute the new expiry date adjusted up for a new fresh write.
     *
     * Every time data is written to the memory store for a given line, the
     * relevance is bumped up by a certain amount, which will boost the priority
     * of this line for computation of group attributes, and delay (possibly
     * indefinitely, if the line is accessed regularly) forgetting the data stored
     * in that line.
     * As opposed to bumpExpiryDuration, this function takes the old timestamp and returns the
     * new timestamp.
     *
     * {@see bumpExpiryDuration}, and keep in mind that the bump depends on when this is called,
     * because the relevance decays exponentially, therefore bumping up a high relevance (for a
     * date far in the future) is less potent than bumping up a low relevance (for a date in
     * a close future).
     *
     * @param oldExpiryDate the old date of expiration.
     * @return the new expiration date after the relevance bump.
     */
    public static long bumpExpiryDate(final long oldExpiryDate) {
        final long now = System.currentTimeMillis();
        final long newDuration = bumpExpiryDuration(oldExpiryDate - now);
        return now + newDuration;
    }
}
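The derivation of K and the stability-by-iteration property in the header comment above can be sanity-checked numerically. The following standalone sketch (the class name RelevanceSketch and the choice of days as the time unit are illustrative only, not part of this commit) recomputes K from M, F and L and verifies the boundary and composition properties of f:

```java
public class RelevanceSketch {
    // Parameters from the explanation above, mirrored here for the check.
    static final double M = 1_000_000;  // cap (CAPPED_RELEVANCE)
    static final double F = 0.5;        // decay factor (DECAY_FACTOR)
    static final double L = 200;        // half-life, in days for readability
    static final double LIFETIME = 400; // capped relevance lifetime, in days

    // K = M * F^(lifetime / L) / (1 - F^(lifetime / L)), as derived in the comment.
    static final double K =
            M * Math.pow(F, LIFETIME / L) / (1 - Math.pow(F, LIFETIME / L));

    // f(x, t) = (x + K) * F^(t / L) - K : relevance x decayed over time t.
    static double f(double x, double t) {
        return (x + K) * Math.pow(F, t / L) - K;
    }

    public static void main(String[] args) {
        System.out.println(K);              // ~333333.33, i.e. M / 3
        System.out.println(f(M, LIFETIME)); // ~0 : a capped line expires in 400 days
        // Stability by iteration: decaying 100 days at once equals decaying
        // 60 days, storing, then decaying 40 more.
        System.out.println(Math.abs(f(M, 100) - f(f(M, 60), 40)) < 1e-6); // true
    }
}
```

The stability check holds exactly in the math because f(f(x, a), b) = (x + K) * F^((a + b) / L) - K; the tolerance only absorbs floating-point rounding.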
+149 −0
/*
 * Copyright (C) 2018 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package com.android.server.net.ipmemorystore;

import static com.android.server.net.ipmemorystore.RelevanceUtils.CAPPED_RELEVANCE;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

import android.support.test.filters.SmallTest;
import android.support.test.runner.AndroidJUnit4;

import org.junit.Test;
import org.junit.runner.RunWith;

/** Unit tests for {@link RelevanceUtils}. */
@SmallTest
@RunWith(AndroidJUnit4.class)
public class RelevanceUtilsTests {
    @Test
    public void testComputeRelevanceForTargetDate() {
        final long dayInMillis = 24L * 60 * 60 * 1000;
        final long base = 1_000_000L; // any given point in time
        // Relevance when the network expires in 1000 years must be capped
        assertEquals(CAPPED_RELEVANCE, RelevanceUtils.computeRelevanceForTargetDate(
                base + 1000L * dayInMillis, base));
        // Relevance when expiry is before the date must be 0
        assertEquals(0, RelevanceUtils.computeRelevanceForTargetDate(base - 1, base));
        // Make sure the relevance for a given target date is higher if the expiry is further
        // in the future
        assertTrue(RelevanceUtils.computeRelevanceForTargetDate(base + 100 * dayInMillis, base)
                < RelevanceUtils.computeRelevanceForTargetDate(base + 150 * dayInMillis, base));

        // Make sure the relevance falls slower as the expiry is closing in. This is to ensure
        // the decay is indeed exponential.
        final int relevanceAtExpiry = RelevanceUtils.computeRelevanceForTargetDate(base, base);
        final int relevance50DaysBeforeExpiry =
                RelevanceUtils.computeRelevanceForTargetDate(base + 50 * dayInMillis, base);
        final int relevance100DaysBeforeExpiry =
                RelevanceUtils.computeRelevanceForTargetDate(base + 100 * dayInMillis, base);
        final int relevance150DaysBeforeExpiry =
                RelevanceUtils.computeRelevanceForTargetDate(base + 150 * dayInMillis, base);
        assertEquals(0, relevanceAtExpiry);
        assertTrue(relevance50DaysBeforeExpiry - relevanceAtExpiry
                < relevance100DaysBeforeExpiry - relevance50DaysBeforeExpiry);
        assertTrue(relevance100DaysBeforeExpiry - relevance50DaysBeforeExpiry
                < relevance150DaysBeforeExpiry - relevance100DaysBeforeExpiry);
    }

    @Test
    public void testIncreaseRelevance() {
        long expiry = System.currentTimeMillis();

        final long firstBump = RelevanceUtils.bumpExpiryDate(expiry);
        // Though a few milliseconds might have elapsed, the first bump should push the duration
        // to days in the future, so unless this test takes literal days between these two lines,
        // this should always pass.
        assertTrue(firstBump > expiry);

        expiry = 0;
        long lastDifference = Long.MAX_VALUE;
        // The relevance should be capped in at most this many steps. Otherwise, fail.
        final int steps = 1000;
        for (int i = 0; i < steps; ++i) {
            final long newExpiry = RelevanceUtils.bumpExpiryDuration(expiry);
            if (newExpiry == expiry) {
                // The relevance should be capped. Make sure it is, then exit without failure.
                assertEquals(newExpiry, RelevanceUtils.CAPPED_RELEVANCE_LIFETIME_MS);
                return;
            }
            // Make sure the new expiry is further in the future than last time.
            assertTrue(newExpiry > expiry);
            // Also check that it was not bumped as much as the last bump, because the
            // decay must be exponential.
            assertTrue(newExpiry - expiry < lastDifference);
            lastDifference = newExpiry - expiry;
            expiry = newExpiry;
        }
        fail("Relevance failed to go to the maximum value after " + steps + " bumps");
    }

    @Test
    public void testContinuity() {
        final long expiry = System.currentTimeMillis();

        // Relevance at expiry and after expiry should be the cap.
        final int relevanceBeforeMaxLifetime = RelevanceUtils.computeRelevanceForTargetDate(expiry,
                expiry - (RelevanceUtils.CAPPED_RELEVANCE_LIFETIME_MS + 1_000_000));
        assertEquals(relevanceBeforeMaxLifetime, CAPPED_RELEVANCE);
        final int relevanceForMaxLifetime = RelevanceUtils.computeRelevanceForTargetDate(expiry,
                expiry - RelevanceUtils.CAPPED_RELEVANCE_LIFETIME_MS);
        assertEquals(relevanceForMaxLifetime, CAPPED_RELEVANCE);

        // If the max relevance is reached at the cap lifetime, one millisecond less than this
        // should be very close. Strictly speaking this is a bit brittle, but it should be
        // good enough for the purposes of the memory store.
        final int relevanceForOneMillisecLessThanCap = RelevanceUtils.computeRelevanceForTargetDate(
                expiry, expiry - RelevanceUtils.CAPPED_RELEVANCE_LIFETIME_MS + 1);
        assertTrue(relevanceForOneMillisecLessThanCap <= CAPPED_RELEVANCE);
        assertTrue(relevanceForOneMillisecLessThanCap >= CAPPED_RELEVANCE - 10);

        // Likewise the relevance one millisecond before expiry should be very close to 0. It's
        // fine if it rounds down to 0.
        final int relevanceOneMillisecBeforeExpiry = RelevanceUtils.computeRelevanceForTargetDate(
                expiry, expiry - 1);
        assertTrue(relevanceOneMillisecBeforeExpiry <= 10);
        assertTrue(relevanceOneMillisecBeforeExpiry >= 0);

        final int relevanceAtExpiry = RelevanceUtils.computeRelevanceForTargetDate(expiry, expiry);
        assertEquals(relevanceAtExpiry, 0);
        final int relevanceAfterExpiry = RelevanceUtils.computeRelevanceForTargetDate(expiry,
                expiry + 1_000_000);
        assertEquals(relevanceAfterExpiry, 0);
    }

    // testIncreaseRelevance makes sure bumping the expiry continuously always yields a
    // monotonically increasing date as a side effect, but this tests that the relevance (as
    // opposed to the expiry date) increases monotonically with the time remaining until expiry.
    @Test
    public void testMonotonicity() {
        // Hopefully the relevance is granular enough to give a different value for every one
        // of this number of steps.
        final int steps = 40;
        final long expiry = System.currentTimeMillis();

        int lastRelevance = -1;
        for (int i = 0; i < steps; ++i) {
            final long date = expiry - i * (RelevanceUtils.CAPPED_RELEVANCE_LIFETIME_MS / steps);
            final int relevance = RelevanceUtils.computeRelevanceForTargetDate(expiry, date);
            assertTrue(relevance > lastRelevance);
            lastRelevance = relevance;
        }
    }
}
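The "Useful relations" and the bump formula in the header comment can likewise be exercised in isolation. This sketch (again a hypothetical class, mirroring the constants of RelevanceUtils with days as the time unit) checks that the two expiry/relevance relations are inverses of each other, that a single bump from zero relevance yields the quoted ~32.7 days to expiry, and that one connection per day reaches the cap on the 27th connection, as the header comment states:

```java
public class RelevanceRelations {
    static final double M = 1_000_000;  // cap (M above)
    static final double F = 0.5;        // decay factor (F above)
    static final double L_DAYS = 200;   // half-life in days (L above)
    static final double LIFETIME_DAYS = 400; // capped lifetime in days
    static final double K =
            M * Math.pow(F, LIFETIME_DAYS / L_DAYS) / (1 - Math.pow(F, LIFETIME_DAYS / L_DAYS));
    static final double I = M / 25;     // bump amount (RELEVANCE_BUMP)

    // Logarithm in base F.
    static double logF(double v) { return Math.log(v) / Math.log(F); }

    // expiry = L * logF(K / (x + K)) : time to expiry for relevance x.
    static double expiryOf(double x) { return L_DAYS * logF(K / (x + K)); }

    // x = K / F^(expiry / L) - K : relevance given time to expiry.
    static double relevanceOf(double expiry) { return K / Math.pow(F, expiry / L_DAYS) - K; }

    // Simulate one connection per day (bump by I, then one day of decay) and
    // return the connection number at which the relevance reaches the cap M.
    static int connectionsToCap() {
        double x = 0; // current relevance
        int connections = 0;
        while (x + I < M) {
            x += I;                                       // bump on connection
            connections++;
            x = (x + K) * Math.pow(F, 1.0 / L_DAYS) - K;  // one day of decay
        }
        return connections + 1; // the next connection hits the cap
    }

    public static void main(String[] args) {
        // The two relations are inverses: relevanceOf(expiryOf(x)) == x.
        System.out.println(Math.abs(relevanceOf(expiryOf(123_456)) - 123_456) < 1e-6);
        System.out.println(expiryOf(I));        // ~32.7 days, as quoted above
        System.out.println(connectionsToCap()); // 27, matching "the 27th connection"
    }
}
```

This reproduces two of the four figures quoted at the end of the header comment from the formulas alone, which gives some confidence that the constants and the derivation agree.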