
Commit 3fece5d6 authored by Mohamad Haj Yahia, committed by Saeed Mahameed

net/mlx5: Continue health polling until it is explicitly stopped



The issue is that when we get an assert we stop polling the health,
and thus we can't enter the error state when we have a real health issue.

Fixes: fd76ee4d ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
parent 57f35c93
+5 −6
@@ -275,10 +275,8 @@ static void poll_health(unsigned long data)
 	struct mlx5_core_health *health = &dev->priv.health;
 	u32 count;
 
-	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
-		mod_timer(&health->timer, get_next_poll_jiffies());
-		return;
-	}
+	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+		goto out;
 
 	count = ioread32be(health->health_counter);
 	if (count == health->prev)
@@ -290,8 +288,6 @@ static void poll_health(unsigned long data)
 	if (health->miss_counter == MAX_MISSES) {
 		dev_err(&dev->pdev->dev, "device's health compromised - reached miss count\n");
 		print_health_info(dev);
-	} else {
-		mod_timer(&health->timer, get_next_poll_jiffies());
 	}
 
 	if (in_fatal(dev) && !health->sick) {
@@ -305,6 +301,9 @@ static void poll_health(unsigned long data)
 				"new health works are not permitted at this stage\n");
 		spin_unlock(&health->wq_lock);
 	}
+
+out:
+	mod_timer(&health->timer, get_next_poll_jiffies());
 }
 
 void mlx5_start_health_poll(struct mlx5_core_dev *dev)
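
The control-flow change can be illustrated with a small user-space model. This is only a sketch, not driver code: every `model_*` / `MODEL_*` name is hypothetical, `MODEL_MAX_MISSES` is an assumed value, the miss-counting branch is assumed (the hunk elides it), and a counter stands in for the `mod_timer()` call. It shows that with a single exit label the poller re-arms on every path, including after the miss limit has been reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; not the real mlx5 definitions. */
enum model_state { MODEL_STATE_UP, MODEL_STATE_INTERNAL_ERROR };

#define MODEL_MAX_MISSES 3 /* assumed value, chosen only for this sketch */

struct model_health {
	enum model_state state;
	uint32_t prev;            /* last health counter value seen */
	uint32_t miss_counter;    /* consecutive polls without counter progress */
	unsigned int rearm_count; /* stands in for mod_timer(): times re-armed */
	bool reported;            /* set once the miss limit has been logged */
};

/*
 * One polling pass. `count` plays the role of the value read from the
 * device's health counter. Every path leaves through `out`, so the
 * timer is re-armed whether or not a problem was detected.
 */
static void model_poll_health(struct model_health *h, uint32_t count)
{
	assert(h != NULL);

	if (h->state == MODEL_STATE_INTERNAL_ERROR)
		goto out;

	/* Assumed miss accounting: no counter progress counts as a miss. */
	if (count == h->prev)
		h->miss_counter++;
	else
		h->miss_counter = 0;
	h->prev = count;

	if (h->miss_counter == MODEL_MAX_MISSES)
		h->reported = true; /* the driver logs here; it no longer skips the re-arm */

out:
	h->rearm_count++; /* single re-arm point, as in the patched function */
}
```

In the pre-patch flow, the re-arm sat in the `else` branch of the miss-count check, so the pass that hit the limit was the last one scheduled. In this model, polling only ends when the caller stops invoking `model_poll_health()`, which mirrors the "until it is explicitly stopped" wording of the commit subject.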