Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 12a93545 authored by Xiubo Li's avatar Xiubo Li Committed by Greg Kroah-Hartman
Browse files

ceph: do not update snapshot context when there is no new snapshot

[ Upstream commit 2e586641c950e7f3e7e008404bd783a466b9b590 ]

We will only track the uppest parent snapshot realm from which we
need to rebuild the snapshot contexts _downward_ in hierarchy. For
all the others having no new snapshot we will do nothing.

This fix will avoid calling ceph_queue_cap_snap() on some inodes
inappropriately. For example, with the code in mainline, suppose there
are 2 directory hierarchies (with 6 directories total), like this:

/dir_X1/dir_X2/dir_X3/
/dir_Y1/dir_Y2/dir_Y3/

Firstly, make a snapshot under /dir_X1/dir_X2/.snap/snap_X2, then make a
root snapshot under /.snap/root_snap. Every time we make snapshots under
/dir_Y1/..., the kclient will always try to rebuild the snap context for
snap_X2 realm and finally will always try to queue cap snaps for dir_Y2
and dir_Y3, which makes no sense.

That's because the snap_X2's seq is 2 and root_snap's seq is 3. So when
creating a new snapshot under /dir_Y1/... the new seq will be 4, and
the mds will send the kclient a snapshot backtrace in _downward_
order: seqs 4, 3.

When ceph_update_snap_trace() is called, it will always rebuild the from
the last realm, that's the root_snap. So later when rebuilding the snap
context, the current logic will always cause it to rebuild the snap_X2
realm and then try to queue cap snaps for all the inodes related in that
realm, even though it's not necessary.

This is accompanied by a lot of these sorts of dout messages:

    "ceph:  queue_cap_snap 00000000a42b796b nothing dirty|writing"

Fix the logic to avoid this situation.

Also, the 'invalidate' word is not precise here. In actuality, it will
cause a rebuild of the existing snapshot contexts or just build
non-existent ones. Rename it to 'rebuild_snapcs'.

URL: https://tracker.ceph.com/issues/44100


Signed-off-by: default avatarXiubo Li <xiubli@redhat.com>
Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
Stable-dep-of: 51884d153f7e ("ceph: avoid putting the realm twice when decoding snaps fails")
Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
parent 0528b19d
Loading
Loading
Loading
Loading
+19 −9
Original line number Diff line number Diff line
@@ -696,7 +696,8 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
	__le64 *prior_parent_snaps;        /* encoded */
	struct ceph_snap_realm *realm = NULL;
	struct ceph_snap_realm *first_realm = NULL;
	int invalidate = 0;
	struct ceph_snap_realm *realm_to_rebuild = NULL;
	int rebuild_snapcs;
	int err = -ENOMEM;
	LIST_HEAD(dirty_realms);

@@ -704,6 +705,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,

	dout("update_snap_trace deletion=%d\n", deletion);
more:
	rebuild_snapcs = 0;
	ceph_decode_need(&p, e, sizeof(*ri), bad);
	ri = p;
	p += sizeof(*ri);
@@ -727,7 +729,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
	err = adjust_snap_realm_parent(mdsc, realm, le64_to_cpu(ri->parent));
	if (err < 0)
		goto fail;
	invalidate += err;
	rebuild_snapcs += err;

	if (le64_to_cpu(ri->seq) > realm->seq) {
		dout("update_snap_trace updating %llx %p %lld -> %lld\n",
@@ -752,22 +754,30 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
		if (realm->seq > mdsc->last_snap_seq)
			mdsc->last_snap_seq = realm->seq;

		invalidate = 1;
		rebuild_snapcs = 1;
	} else if (!realm->cached_context) {
		dout("update_snap_trace %llx %p seq %lld new\n",
		     realm->ino, realm, realm->seq);
		invalidate = 1;
		rebuild_snapcs = 1;
	} else {
		dout("update_snap_trace %llx %p seq %lld unchanged\n",
		     realm->ino, realm, realm->seq);
	}

	dout("done with %llx %p, invalidated=%d, %p %p\n", realm->ino,
	     realm, invalidate, p, e);
	dout("done with %llx %p, rebuild_snapcs=%d, %p %p\n", realm->ino,
	     realm, rebuild_snapcs, p, e);

	/* invalidate when we reach the _end_ (root) of the trace */
	if (invalidate && p >= e)
		rebuild_snap_realms(realm, &dirty_realms);
	/*
	 * this will always track the uppest parent realm from which
	 * we need to rebuild the snapshot contexts _downward_ in
	 * hierarchy.
	 */
	if (rebuild_snapcs)
		realm_to_rebuild = realm;

	/* rebuild_snapcs when we reach the _end_ (root) of the trace */
	if (realm_to_rebuild && p >= e)
		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);

	if (!first_realm)
		first_realm = realm;