
Commit 70b565bb authored by Vaibhav Jain, committed by Michael Ellerman

cxl: Prevent adapter reset if an active context exists



This patch prevents resetting the cxl adapter via sysfs while one or
more cxl_context instances are active on it. This protects against an
unrecoverable error that occurs when the PSL still owns a dirty cache
line after the reset and the host then touches the same cache line. If
a forced reset of the card is required irrespective of any active
contexts, the int value -1 can be written to the 'reset' sysfs
attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active contexts attached to the
card, which is checked against '0' before proceeding with the reset. To
prevent a race condition where a context is activated just after the
reset check is performed, contexts_num is atomically set to '-1' after
the reset check to indicate that no more contexts can be activated on
the card.

Before activating a context we atomically test whether contexts_num is
non-negative and, if so, increment its value by one. A negative value of
contexts_num indicates that the card is about to be reset, and context
activation fails at that point.

Fixes: 62fa19d4 ("cxl: Add ability to reset the card")
Cc: stable@vger.kernel.org # v4.0+
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
parent 65bc3ece
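
The counter protocol described in the commit message can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: the bodies of the four cxl_adapter_context_* helpers are not part of the hunks shown on this page, so `struct adapter_sketch` and the `context_*` functions below are hypothetical stand-ins, and `<stdatomic.h>` replaces the kernel's atomic_t API.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct cxl; only the counter is modelled. */
struct adapter_sketch {
	/* >0: active contexts, 0: idle, -1: locked for reset */
	atomic_int contexts_num;
};

/* Increment the counter unless it is negative (reset pending). */
static int context_get(struct adapter_sketch *a)
{
	int old = atomic_load(&a->contexts_num);

	do {
		if (old < 0)
			return -EBUSY;
		/* On failure, old is reloaded and the sign is rechecked. */
	} while (!atomic_compare_exchange_weak(&a->contexts_num, &old, old + 1));

	return 0;
}

/* Decrement the counter, but never below zero. */
static void context_put(struct adapter_sketch *a)
{
	int old = atomic_load(&a->contexts_num);

	while (old > 0 &&
	       !atomic_compare_exchange_weak(&a->contexts_num, &old, old - 1))
		;
}

/* Move 0 -> -1 in one step; fails if any context is active. */
static int context_lock(struct adapter_sketch *a)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&a->contexts_num, &expected, -1))
		return 0;
	return -EBUSY;
}

/* Move -1 -> 0; if the lock was not held, force the counter to 0. */
static void context_unlock(struct adapter_sketch *a)
{
	int expected = -1;

	if (!atomic_compare_exchange_strong(&a->contexts_num, &expected, 0))
		atomic_store(&a->contexts_num, 0); /* kernel code would warn here */
}
```

Because the check for zero and the switch to -1 happen in a single compare-and-exchange, a concurrent `context_get()` either lands before the lock (and makes the lock fail) or after it (and sees the negative value), which is the race the commit message sets out to close.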
+5 −2
@@ -220,8 +220,11 @@ What: /sys/class/cxl/<card>/reset
Date:           October 2014
Contact:        linuxppc-dev@lists.ozlabs.org
Description:    write only
-                Writing 1 will issue a PERST to card which may cause the card
-                to reload the FPGA depending on load_image_on_perst.
+                Writing 1 will issue a PERST to card provided there are no
+                contexts active on any one of the card AFUs. This may cause
+                the card to reload the FPGA depending on load_image_on_perst.
+                Writing -1 will do a force PERST irrespective of any active
+                contexts on the card AFUs.
Users:		https://github.com/ibm-capi/libcxl

What:		/sys/class/cxl/<card>/perst_reloads_same_image (not in a guest)
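
From user space, the two values accepted by the reset attribute could be written as in the sketch below. `request_card_reset()` is a hypothetical helper, not part of libcxl or the kernel; the caller supplies the full attribute path (the `<card>` component is system-specific), and the errno returned for a refused reset is not stated in the hunks shown here, so the helper simply passes back whatever the write reports.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "1" (reset only if no contexts are active)
 * or "-1" (forced reset) to the given reset attribute.
 * Returns 0 on success or a negative errno value.
 */
static int request_card_reset(const char *attr_path, int force)
{
	const char *val = force ? "-1" : "1";
	size_t len = strlen(val);
	ssize_t n;
	int err = 0;
	int fd = open(attr_path, O_WRONLY);

	if (fd < 0)
		return -errno;

	n = write(fd, val, len);
	if (n < 0)
		err = -errno;	/* e.g. the kernel refused the reset */
	else if ((size_t)n != len)
		err = -EIO;	/* short write */

	close(fd);
	return err;
}
```

A caller would typically try the guarded form first (`force` set to 0) and fall back to the forced form only when losing the active contexts is acceptable.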
+9 −0
@@ -229,6 +229,14 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,
	if (ctx->status == STARTED)
		goto out; /* already started */

	/*
	 * Increment the mapped context count for adapter. This also checks
	 * if adapter_context_lock is taken.
	 */
	rc = cxl_adapter_context_get(ctx->afu->adapter);
	if (rc)
		goto out;

	if (task) {
		ctx->pid = get_task_pid(task, PIDTYPE_PID);
		ctx->glpid = get_task_pid(task->group_leader, PIDTYPE_PID);
@@ -240,6 +248,7 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,

	if ((rc = cxl_ops->attach_process(ctx, kernel, wed, 0))) {
		put_pid(ctx->pid);
		cxl_adapter_context_put(ctx->afu->adapter);
		cxl_ctx_put();
		goto out;
	}
+3 −0
@@ -238,6 +238,9 @@ int __detach_context(struct cxl_context *ctx)
	put_pid(ctx->glpid);

	cxl_ctx_put();

	/* Decrease the attached context count on the adapter */
	cxl_adapter_context_put(ctx->afu->adapter);
	return 0;
}

+24 −0
@@ -618,6 +618,14 @@ struct cxl {
	bool perst_select_user;
	bool perst_same_image;
	bool psl_timebase_synced;

	/*
	 * number of contexts mapped on to this card. Possible values are:
	 * >0: Number of contexts mapped and new one can be mapped.
	 *  0: No active contexts and new ones can be mapped.
	 * -1: No contexts mapped and new ones cannot be mapped.
	 */
	atomic_t contexts_num;
};

int cxl_pci_alloc_one_irq(struct cxl *adapter);
@@ -944,4 +952,20 @@ bool cxl_pci_is_vphb_device(struct pci_dev *dev);

/* decode AFU error bits in the PSL register PSL_SERR_An */
void cxl_afu_decode_psl_serr(struct cxl_afu *afu, u64 serr);

/*
 * Increments the number of attached contexts on an adapter.
 * If adapter_context_lock is taken then return -EBUSY.
 */
int cxl_adapter_context_get(struct cxl *adapter);

/* Decrements the number of attached contexts on an adapter */
void cxl_adapter_context_put(struct cxl *adapter);

/* If no active contexts then prevents contexts from being attached */
int cxl_adapter_context_lock(struct cxl *adapter);

/* Unlock the contexts-lock if taken. Warn and force unlock otherwise */
void cxl_adapter_context_unlock(struct cxl *adapter);

#endif
+11 −0
@@ -205,11 +205,22 @@ static long afu_ioctl_start_work(struct cxl_context *ctx,
	ctx->pid = get_task_pid(current, PIDTYPE_PID);
	ctx->glpid = get_task_pid(current->group_leader, PIDTYPE_PID);

	/*
	 * Increment the mapped context count for adapter. This also checks
	 * if adapter_context_lock is taken.
	 */
	rc = cxl_adapter_context_get(ctx->afu->adapter);
	if (rc) {
		afu_release_irqs(ctx, ctx);
		goto out;
	}

	trace_cxl_attach(ctx, work.work_element_descriptor, work.num_interrupts, amr);

	if ((rc = cxl_ops->attach_process(ctx, false, work.work_element_descriptor,
							amr))) {
		afu_release_irqs(ctx, ctx);
		cxl_adapter_context_put(ctx->afu->adapter);
		goto out;
	}
