Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit d7e02c08 authored by Bjorn Helgaas's avatar Bjorn Helgaas
Browse files

Merge branch 'pci/aer'

  - unify AER decoding for native and ACPI CPER sources (Alexandru Gagniuc)

  - add TLP header info to AER tracepoint (Thomas Tai)

  - add generic pcie_wait_for_link() interface (Oza Pawandeep)

  - handle AER ERR_FATAL by removing and re-enumerating devices, as
    Downstream Port Containment does (Oza Pawandeep)

  - factor out common code between AER and DPC recovery (Oza Pawandeep)

  - stop triggering DPC for ERR_NONFATAL errors (Oza Pawandeep)

  - share ERR_FATAL recovery path between AER and DPC (Oza Pawandeep)

* pci/aer:
  PCI/AER: Replace struct pcie_device with pci_dev
  PCI/AER: Remove unused parameters
  PCI/AER: Decode Error Source Requester ID
  PCI/AER: Remove aer_recover_work_func() forward declaration
  PCI/DPC: Use the generic pcie_do_fatal_recovery() path
  PCI/AER: Pass service type to pcie_do_fatal_recovery()
  PCI/DPC: Disable ERR_NONFATAL handling by DPC
  PCI/portdrv: Add generic pcie_port_find_device()
  PCI/portdrv: Add generic pcie_port_find_service()
  PCI/AER: Factor out error reporting to drivers/pci/pcie/err.c
  PCI/AER: Rename error recovery interfaces to generic PCI naming
  PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices
  PCI: Add generic pcie_wait_for_link() interface
  PCI/AER: Add TLP header information to tracepoint
  PCI/AER: Unify error bit printing for native and CPER reporting
parents 60cc43fc e13d17f7
Loading
Loading
Loading
Loading
+25 −10
Original line number Diff line number Diff line
@@ -110,7 +110,7 @@ The actual steps taken by a platform to recover from a PCI error
event will be platform-dependent, but will follow the general
sequence described below.

STEP 0: Error Event
STEP 0: Error Event: ERR_NONFATAL
-------------------
A PCI bus error is detected by the PCI hardware.  On powerpc, the slot
is isolated, in that all I/O is blocked: all reads return 0xffffffff,
@@ -228,13 +228,7 @@ proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
proceeds to STEP 4 (Slot Reset)

STEP 3: Link Reset
------------------
The platform resets the link.  This is a PCI-Express specific step
and is done whenever a fatal error has been detected that can be
"solved" by resetting the link.

STEP 4: Slot Reset
STEP 3: Slot Reset
------------------

In response to a return value of PCI_ERS_RESULT_NEED_RESET, the
@@ -320,7 +314,7 @@ Failure).
>>> However, it probably should.


STEP 5: Resume Operations
STEP 4: Resume Operations
-------------------------
The platform will call the resume() callback on all affected device
drivers if all drivers on the segment have returned
@@ -332,7 +326,7 @@ a result code.
At this point, if a new error happens, the platform will restart
a new error recovery sequence.

STEP 6: Permanent Failure
STEP 5: Permanent Failure
-------------------------
A "permanent failure" has occurred, and the platform cannot recover
the device.  The platform will call error_detected() with a
@@ -355,6 +349,27 @@ errors. See the discussion in powerpc/eeh-pci-error-recovery.txt
for additional detail on real-life experience of the causes of
software errors.

STEP 0: Error Event: ERR_FATAL
-------------------
PCI bus error is detected by the PCI hardware. On powerpc, the slot is
isolated, in that all I/O is blocked: all reads return 0xffffffff, all
writes are ignored.

STEP 1: Remove devices
--------------------
Platform removes the devices depending on the error agent, it could be
this port for all subordinates or upstream component (likely downstream
port)

STEP 2: Reset link
--------------------
The platform resets the link.  This is a PCI-Express specific step and is
done whenever a fatal error has been detected that can be "solved" by
resetting the link.

STEP 3: Re-enumerate the devices
--------------------
Initiates the re-enumeration.

Conclusion; General Remarks
---------------------------
+3 −17
Original line number Diff line number Diff line
@@ -231,25 +231,11 @@ bool pciehp_check_link_active(struct controller *ctrl)
	return ret;
}

static void __pcie_wait_link_active(struct controller *ctrl, bool active)
{
	int timeout = 1000;

	if (pciehp_check_link_active(ctrl) == active)
		return;
	while (timeout > 0) {
		msleep(10);
		timeout -= 10;
		if (pciehp_check_link_active(ctrl) == active)
			return;
	}
	ctrl_dbg(ctrl, "Data Link Layer Link Active not %s in 1000 msec\n",
			active ? "set" : "cleared");
}

static void pcie_wait_link_active(struct controller *ctrl)
{
	__pcie_wait_link_active(ctrl, true);
	struct pci_dev *pdev = ctrl_dev(ctrl);

	pcie_wait_for_link(pdev, true);
}

static bool pci_bus_check_dev(struct pci_bus *bus, int devfn)
+1 −1
Original line number Diff line number Diff line
@@ -1535,7 +1535,7 @@ static int pci_uevent(struct device *dev, struct kobj_uevent_env *env)
	return 0;
}

#if defined(CONFIG_PCIEAER) || defined(CONFIG_EEH)
#if defined(CONFIG_PCIEPORTBUS) || defined(CONFIG_EEH)
/**
 * pci_uevent_ers - emit a uevent during recovery path of PCI device
 * @pdev: PCI device undergoing error recovery
+29 −0
Original line number Diff line number Diff line
@@ -4138,6 +4138,35 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)

	return pci_dev_wait(dev, "PM D3->D0", PCIE_RESET_READY_POLL_MS);
}
/**
 * pcie_wait_for_link - Wait until link is active or inactive
 * @pdev: Bridge device
 * @active: waiting for active or inactive?
 *
 * Use this to wait till link becomes active or inactive.
 */
bool pcie_wait_for_link(struct pci_dev *pdev, bool active)
{
	int timeout = 1000;
	bool ret;
	u16 lnk_status;

	for (;;) {
		pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
		ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
		if (ret == active)
			return true;
		if (timeout <= 0)
			break;
		msleep(10);
		timeout -= 10;
	}

	pci_info(pdev, "Data Link Layer Link Active not %s in 1000 msec\n",
		 active ? "set" : "cleared");

	return false;
}

void pci_reset_secondary_bus(struct pci_dev *dev)
{
+5 −0
Original line number Diff line number Diff line
@@ -353,6 +353,11 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,

void pci_enable_acs(struct pci_dev *dev);

/* PCI error reporting and recovery */
void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service);
void pcie_do_nonfatal_recovery(struct pci_dev *dev);

bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
#ifdef CONFIG_PCIEASPM
void pcie_aspm_init_link_state(struct pci_dev *pdev);
void pcie_aspm_exit_link_state(struct pci_dev *pdev);
Loading