[Linux-kernel-mentees] [PATCH v2] PCI: Add support for LTR

Puranjay Mohan puranjay12 at gmail.com
Thu Sep 17 16:36:43 UTC 2020


On Wed, Sep 16, 2020 at 3:31 AM Bjorn Helgaas <helgaas at kernel.org> wrote:
>
> On Tue, Aug 25, 2020 at 11:31:31PM +0530, Puranjay Mohan wrote:
> > Add a new function pci_ltr_init() which will be called from
> > pci_init_capabilities() to initialize every PCIe device's LTR values.
> > Add code in probe.c to evaluate LTR _DSM and save the latencies in pci_dev.
> >
> > Signed-off-by: Puranjay Mohan <puranjay12 at gmail.com>
> > ---
> > v2 - add an #ifdefin pci-acpi.c to fix the bug reported by test bot.
> > v1 - based the patch on v5.9-rc1 so it applies correctly.
> > ---
> >  drivers/pci/pci-acpi.c   | 30 ++++++++++++++++++++++++++++++
> >  drivers/pci/pci.c        | 27 +++++++++++++++++++++++++++
> >  drivers/pci/pci.h        |  5 +++++
> >  drivers/pci/probe.c      |  6 ++++++
> >  include/linux/pci-acpi.h |  1 +
> >  include/linux/pci.h      |  2 ++
> >  6 files changed, 71 insertions(+)
> >
> > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > index d5869a03f748..af8297040c38 100644
> > --- a/drivers/pci/pci-acpi.c
> > +++ b/drivers/pci/pci-acpi.c
> > @@ -1213,6 +1213,36 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
> >       ACPI_FREE(obj);
> >  }
> >
> > +/* pci_acpi_evaluate_ltr_latency
> > + *
> > + * @dev - the pci_dev to evaluate and save latencies
> > + */
> > +void pci_acpi_evaluate_ltr_latency(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_PCIASPM
> > +     union acpi_object *obj, *elements;
> > +     struct acpi_device *handle;
>
> Nit: sort declarations in order of use: handle, obj, etc.
>
> > +
> > +     handle = ACPI_HANDLE(&dev->dev);
> > +     if (!handle)
> > +             return;
> > +
> > +     obj = acpi_evaluate_dsm(handle, &pci_acpi_dsm_guid, 0x2,
> > +                             DSM_PCI_LTR_MAX_LATENCY, NULL);
> > +     if (!obj)
> > +             return;
>
> Sorry for the delay in responding.  Coincidentally, part of the reason
> is that I've been consumed with some PCI Firmware spec updates related
> to _DSM, which actually happens to affect this code.
>
> _DSM is defined in the public ACPI spec; see ACPI v6.3, sec 9.1.1.
> One problem with this interface is that it does not define any
> standard return type or value, so there is no way to tell from the
> return value whether the function is implemented.  That means I don't
> think it's safe to rely on "!obj" meaning "DSM_PCI_LTR_MAX_LATENCY
> isn't implemented".  Similarly, I don't think it's safe to assume
> "obj" means it *is* implemented.
>
> We have lots of places in the kernel that do this wrong.  One that
> does it *right* is acpi_dpc_port_get().  Short story: we need to call
> acpi_check_dsm() first before calling acpi_evaluate_dsm().
Okay, I will add this call.

>
> There's an ongoing discussion about what *revision* we should use.
> For now I think the right thing is to use 2 as you did, since that's
> the first revision where DSM_PCI_LTR_MAX_LATENCY is defined.  I would
> use "2" (decimal) like the other uses in the file, though.
I will change 0x2 to 2.

>
> > +     if (obj->type == ACPI_TYPE_PACKAGE && obj->package.count == 4) {
> > +             elements = obj->package.elements;
> > +             dev->max_snoop_latency = (u16)elements[1].integer.value |
> > +                             ((u16)elements[0].integer.value << PCI_LTR_SCALE_SHIFT);
> > +             dev->max_nosnoop_latency = (u16)elements[3].integer.value |
> > +                             ((u16)elements[2].integer.value << PCI_LTR_SCALE_SHIFT);
> > +     }
> > +     ACPI_FREE(obj);
> > +#endif
> > +}
> > +
> >  static void pci_acpi_set_external_facing(struct pci_dev *dev)
> >  {
> >       u8 val;
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index a458c46d7e39..b5531272b865 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3056,6 +3056,33 @@ void pci_pm_init(struct pci_dev *dev)
> >               dev->imm_ready = 1;
> >  }
> >
> > +/**
> > + * pci_ltr_init - Initialize Latency Tolerance Information of given PCI device
> > + * @dev: PCI device to handle.
> > + */
> > +void pci_ltr_init(struct pci_dev *dev)
> > +{
> > +#ifdef CONFIG_PCIASPM
> > +     int ltr;
> > +     struct pci_dev *endpoint_dev = dev;
>
> Nit: sort the declarations in order of use, i.e., endpoint_dev, ltr, ...
>
> > +     u16 max_snoop_sum = 0;
> > +     u16 max_nosnoop_sum = 0;
> > +
> > +     ltr = pci_find_ext_capability(endpoint_dev, PCI_EXT_CAP_ID_LTR);
> > +     if (!ltr)
> > +             return;
> > +
> > +     dev = pci_upstream_bridge(dev);
> > +     while (dev) {
> > +             max_snoop_sum += dev->max_snoop_latency;
> > +             max_nosnoop_sum += dev->max_nosnoop_latency;
>
> dev->max_snoop_latency and dev->max_nosnoop_latency are not simple
> scalars, are they?  Aren't they 3 bits of scale and 10 bits of value?
> I don't think adding these is as easy as "+=" except in the simple
> case when the scale is "000", i.e., "use the 10 bits of value as-is".
>
> I think we have to decode scale * latency for each device in the path,
> add all those up, then re-encode using the appropriate scale for the
> config write below.
I was thinking about it. If we use 2 more variables and store the
scale and value separately, then It will become easy.
we can add the values directly but, as you said we can't add the
scales, I will think about this more.

>
> > +             dev = pci_upstream_bridge(dev);
> > +     }
> > +     pci_write_config_word(endpoint_dev, ltr + PCI_LTR_MAX_SNOOP_LAT, max_snoop_sum);
> > +     pci_write_config_word(endpoint_dev, ltr + PCI_LTR_MAX_NOSNOOP_LAT, max_nosnoop_sum);
>
> I think we definitely need to do this, but we need to be careful about
> updating things here, since I suspect mistakes will cause
> hard-to-debug problems.  PCIe r5.0, sec 6.18, says:
>
>   Setting the value and scale fields to all 0’s indicates that the
>   device will be impacted by any delay and that the best possible
>   service is requested.
>
> If I understand that correctly, larger values for these registers
> indicate that the device can tolerate more latency, so we want to err
> on the side of setting these fields too low rather than too high.
>
> So I think maybe we should read the current values first.  If the
> value we computed is the same, we can just skip the write.
>
> What should we do if the value we computed is *larger* than the
> current value?  Hot-added devices will power up with zeroes here, so I
> think we probably should log a note (pci_info()) and do the write.  If
> we don't do the write and we enable ASPM, LTR, and L1 Substates, the
> zeroes mean the device will be requesting the best possible service,
> so we won't use the L1.2 substate as much as we should, resulting in
> more power consumption than necessary.
>
> If we read something non-zero, it probably means firmware has already
> configured this.  If we computed something larger, we should probably
> still log a message, but maybe skip the write.  My thinking is that
> this may be a firmware bug, and fixing it could reduce power usage.
> If we do the write, we risk breaking something that used to work.
>
> If the value we computed is *smaller* than the current value, I think
> we should log a note just to show that we're changing something, and
> we *should* do the write, because writing a smaller value should
> always be safe.
>
> The log messages should include both the current value and the new
> value we computed (decoded so they're easy to read and compare).
>
I will add code to do all this functionality.

> > +#endif
> > +}
> > +
> >  static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> >  {
> >       unsigned long flags = IORESOURCE_PCI_FIXED | IORESOURCE_PCI_EA_BEI;
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index fa12f7cbc1a0..ef3d22b82200 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -110,6 +110,7 @@ void pci_free_cap_save_buffers(struct pci_dev *dev);
> >  bool pci_bridge_d3_possible(struct pci_dev *dev);
> >  void pci_bridge_d3_update(struct pci_dev *dev);
> >  void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
> > +void pci_ltr_init(struct pci_dev *dev);
> >
> >  static inline void pci_wakeup_event(struct pci_dev *dev)
> >  {
> > @@ -680,11 +681,15 @@ static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL
> >
> >  #ifdef CONFIG_ACPI
> >  int pci_acpi_program_hp_params(struct pci_dev *dev);
> > +void pci_acpi_evaluate_ltr_latency(struct pci_dev *dev);
> >  #else
> >  static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
> >  {
> >       return -ENODEV;
> >  }
> > +static inline void pci_acpi_evaluate_ltr_latency(struct pci_dev *dev)
> > +{
> > +}
>
> Nit: all on one line as in rest of this file:
>
>   static inline void pci_acpi_evaluate_ltr_latency(struct pci_dev *dev) { }
>
> >  #endif
> >
> >  #ifdef CONFIG_PCIEASPM
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 03d37128a24f..0257aa615665 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -2140,6 +2140,11 @@ static void pci_configure_ltr(struct pci_dev *dev)
> >               dev->ltr_path = 1;
> >       }
> >  #endif
> > +
> > +     /*
> > +      * Read latency values from _DSM and save in pci_dev
> > +      */
> > +     pci_acpi_evaluate_ltr_latency(dev);
> >  }
> >
> >  static void pci_configure_eetlp_prefix(struct pci_dev *dev)
> > @@ -2400,6 +2405,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
> >       pci_ptm_init(dev);              /* Precision Time Measurement */
> >       pci_aer_init(dev);              /* Advanced Error Reporting */
> >       pci_dpc_init(dev);              /* Downstream Port Containment */
> > +     pci_ltr_init(dev);              /* Latency Tolerance Reporting */
> >
> >       pcie_report_downtraining(dev);
> >
> > diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> > index 5ba475ca9078..e23236a4ff66 100644
> > --- a/include/linux/pci-acpi.h
> > +++ b/include/linux/pci-acpi.h
> > @@ -110,6 +110,7 @@ extern const guid_t pci_acpi_dsm_guid;
> >
> >  /* _DSM Definitions for PCI */
> >  #define DSM_PCI_PRESERVE_BOOT_CONFIG         0x05
> > +#define DSM_PCI_LTR_MAX_LATENCY                      0x06
> >  #define DSM_PCI_DEVICE_NAME                  0x07
> >  #define DSM_PCI_POWER_ON_RESET_DELAY         0x08
> >  #define DSM_PCI_DEVICE_READINESS_DURATIONS   0x09
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 835530605c0d..9de6b290ed81 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -380,6 +380,8 @@ struct pci_dev {
> >       struct pcie_link_state  *link_state;    /* ASPM link state */
> >       unsigned int    ltr_path:1;     /* Latency Tolerance Reporting
> >                                          supported from root to here */
> > +     u16 max_snoop_latency;          /* LTR Max Snoop latency */
> > +     u16 max_nosnoop_latency;        /* LTR Max No Snoop latency */
>
> Nit: "Max No-Snoop" to match spec usage.
>
> >  #endif
> >       unsigned int    eetlp_prefix_path:1;    /* End-to-End TLP Prefix */
> >
> > --
> > 2.27.0
> >



-- 
Thanks and Regards

Yours Truly,

Puranjay Mohan


More information about the Linux-kernel-mentees mailing list