PCIe bus errors on Linux

Troubleshooting PCIe Bus Error severity Corrected on Ubuntu and Linux Mint

Last updated December 21, 2019 By Community

Recently I was trying to install Mint on several nodes in my institute. At times the installation failed and the screen filled with ‘PCIe Bus’ errors. I have also observed a similar issue with Ubuntu 18.04.

I was stuck on it for more than a month. After trying many solutions and making many observations (the solution is the same, but the observations and treatment may differ), I found something that helped me and that I think could help other Ubuntu and Linux Mint users.

Observations about PCIe Bus Error severity Corrected

It happened with my HP system, and it seems that there are some compatibility issues with the HP hardware. The PCIe Bus Error is basically the Linux kernel reporting a hardware issue.

This error reporting turns into a nightmare because of the frequency of the error messages generated by the system. I have noticed in various Linux forums that many HP users have encountered this error; HP probably needs to improve Linux support for its hardware.

Do note that this doesn’t necessarily mean that you cannot use Linux on your HP system. You might be able to use Linux like everyone else. It’s just that seeing this message flashing on the screen on every boot is annoying and sometimes, it could lead to bigger troubles.

If the system keeps on reporting, the log files will keep growing. If you have limited space for root, your system could get stuck at a black screen displaying the PCIe error message and fail to boot.

Now that you know a few things, let’s see how to tackle this error.

Handling PCIe Bus Error messages if you can boot into your Linux system

If you see the PCIe Bus Error message on the screen while booting but you are still able to log in, you could do a workaround for this annoyance.

You can do little on the hardware compatibility front. I mean you (most probably) cannot go ahead and start coding drivers for your hardware or fix the existing driver code. If your system works fine, your main concern should be that excessive error reporting doesn’t eat up the disk space.


In that regard, you can change a Linux kernel parameter and ask the kernel to stop reporting the PCIe errors. To do that, you need to edit the grub configuration.

Basically, you just have to use a text editor for editing the file.

First things first: make a backup of your grub config file so that you can revert if you are not sure of the things you changed. Open a terminal and copy /etc/default/grub to a backup file.
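The backup step can be sketched as a small shell helper. The function name backup_grub_config and the .bak suffix are assumptions made for illustration; the article's own command is not preserved in this copy.

```shell
# Hypothetical helper: copy a GRUB defaults file to "<file>.bak".
# /etc/default/grub is the usual location on Ubuntu and Mint; pass another
# path as the first argument to try the helper on a scratch copy.
backup_grub_config() {
    grub_file="${1:-/etc/default/grub}"
    # -p preserves mode and timestamps so the backup mirrors the original
    cp -p "$grub_file" "$grub_file.bak"
}
```

On a real system the same thing can be done in one line with root privileges: sudo cp -p /etc/default/grub /etc/default/grub.bak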

Now open the file /etc/default/grub with Gedit for editing (with root privileges, for example sudo gedit /etc/default/grub).

Look for the line that has GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

Add pci=noaer to this line. AER stands for Advanced Error Reporting, and ‘noaer’ asks the kernel not to use or log Advanced Error Reporting. The changed line should read GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=noaer".
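If you prefer to script the change instead of editing by hand, the following is a minimal sketch. The function name add_kernel_param is hypothetical, and the sketch assumes GNU sed, a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, and a parameter that contains no slashes.

```shell
# Hypothetical helper: append one kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a GRUB defaults file.
add_kernel_param() {
    param="$1"
    grub_file="${2:-/etc/default/grub}"
    # Do nothing if the parameter is already present on that line
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]$param[\" ]" "$grub_file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$grub_file"
}
```

Calling add_kernel_param pci=noaer in a root shell edits /etc/default/grub in place; calling it a second time leaves the file unchanged.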

Once you have saved the file, update grub by running sudo update-grub.

Restart Ubuntu and you shouldn’t see the ‘PCIe Bus Error severity=Corrected’ messages anymore.
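After the reboot you can confirm that the kernel actually received the parameter by looking at /proc/cmdline. A minimal sketch follows; the helper name cmdline_has_param is an assumption made for illustration.

```shell
# Hypothetical helper: succeed if the given parameter appears as a whole
# token on the kernel command line (read from /proc/cmdline by default).
cmdline_has_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each token on its own line, then look for an exact, literal match
    tr ' ' '\n' < "$cmdline_file" | grep -qxF "$param"
}
```

For example: cmdline_has_param pci=noaer && echo "pci=noaer is active"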

If this doesn’t fix the issue for you, you can try to change other kernel parameters.

Further troubleshooting: Disable MSI

Now you are resorting to trial and error. You may try disabling MSI. Though the Linux kernel has supported MSI for several years now, a faulty MSI implementation by some hardware manufacturer may lead to the PCIe errors.

The drill is practically the same as you saw in the previous section. You edit the grub configuration and make the line read GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi".

Then update grub as before and reboot the system.

Even further troubleshooting: Disable mmconf

I know it’s getting repetitive, but if you are still facing the issue, it could be worth giving this one last try. This time, disable mmconf with a Linux kernel parameter.

mmconf means memory mapped config and if you have an old computer, a buggy BIOS may lead to this issue.

The steps remain the same. Just change the line in your grub config so that it reads GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf".
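Since each attempt swaps one pci= option for another, the swap itself can be sketched as a helper. The name set_pci_option is hypothetical; the sketch assumes GNU sed and that a pci=... token from an earlier attempt is already on the line (it changes nothing otherwise).

```shell
# Hypothetical helper: replace the first pci=... token on the
# GRUB_CMDLINE_LINUX_DEFAULT line with pci=<option>.
set_pci_option() {
    option="$1"
    grub_file="${2:-/etc/default/grub}"
    # Only touch the default command line; other lines are left as they are
    sed -i "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/pci=[^ \"]*/pci=$option/" "$grub_file"
}
```

For example, set_pci_option nommconf turns "quiet splash pci=nomsi" into "quiet splash pci=nommconf". Remember to run sudo update-grub afterwards.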

Can’t boot! How to edit grub config now?

In some cases, if you are not even able to boot at all, perhaps your root is out of space. An idea here would be to delete old log files and see if you could boot now and if yes, change the grub config.

On reboot, if you are stuck with logs on the screen, do a hard reboot (use the power button to turn the machine off and on again). When you power on, choose recovery mode from the grub screen. It should be under Advanced options.

If your system doesn’t show the grub screen, press and hold the Shift key at boot. In some systems, pressing the Esc key brings up the grub screen.

Under Advanced options, select the recovery mode entry, then choose the option to drop into a root shell.

If you use the ls command to find large files in /var/log, you’ll see that syslog and kern.log take up a huge amount of space. Empty these log files to free up space on the root partition.
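The clean-up can be sketched with two small helpers. The names largest_logs and empty_log are assumptions made for illustration, and the sketch empties files rather than deleting them so that the logging daemon can keep writing to the same path.

```shell
# Hypothetical helper: list the N largest entries in a log directory,
# biggest first (defaults: /var/log and 5 entries).
largest_logs() {
    log_dir="${1:-/var/log}"
    count="${2:-5}"
    # -S sorts by file size, largest first; -1 prints one name per line
    ls -1S "$log_dir" | head -n "$count"
}

# Hypothetical helper: truncate a log file to zero bytes without deleting it.
empty_log() {
    : > "$1"
}
```

In a recovery root shell this might look like: largest_logs /var/log 3, followed by empty_log /var/log/syslog and empty_log /var/log/kern.log. If the root filesystem turns out to be mounted read-only there, mount -o remount,rw / makes it writable first.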

Once that is done, reboot your system. You should be able to log in. You should quickly change the grub parameters as discussed above. Adding pci=noaer should help you in this case.

I know it’s more of a workaround than a solution. But this problem troubled me for a long time, and this workaround helped me get around the error. Otherwise I would have had to reinstall the system.

I just wanted to share what worked for me with the community here. I hope it helps you as well.

This article is written by Arun Shrimali. Arun is IT Head at Resonance Institute in India and he tries to implement Open Source Software across his organization.

The article has been edited by Abhishek Prakash.


PCIe Bus Error messages when shutting down Linux

After a Linux update, the next time I shut the computer down the screen filled with a stream of identical error messages: "pcieport 0000:00:1c.3: PCIe Bus Error: severity=Corrected, type=Physical Layer". This goes on for quite a while, so it is tempting to force the machine off with the power button, which I did a couple of times before I found a fix.

Shutting down or rebooting had become a hellish procedure: the log of repeating errors scrolled up the screen for a full two minutes. It turned out that this problem affects a great many owners of Intel processors and is apparently related to attempts to lower the power state of the PCIe port. Linux forums suggest adding the following two kernel boot parameters:

pci=nomsi disables the use of MSI (Message Signaled Interrupts);
pci=noaer disables the kernel's PCIe Advanced Error Reporting.

However, I really did not want to fall back to legacy interrupt delivery; that is a crutch at best. I decided to take a different route and simply disable ASPM (Active State Power Management).

ASPM manages the power consumption of PCI Express (PCIe) links by putting them into a low-power state when the device is not in use. On the other hand, enabling ASPM adds latency to device responses, because switching the link between states takes some time.


ASPM behavior can be controlled by adding the pcie_aspm parameter to the boot loader configuration or by writing to the file /sys/module/pcie_aspm/parameters/policy (as a boot parameter, pcie_aspm=off disables ASPM, while pcie_aspm=force enables it even on devices that do not declare support for it). I edited the GRUB configuration:

# open the GRUB settings file in a text editor
sudo editor /etc/default/grub
# add the pcie_aspm=off parameter
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"
# save the file and update the boot loader
sudo update-grub
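The sysfs file mentioned above can also be read to see which ASPM policy is currently active, since the kernel marks the active one with square brackets. A minimal sketch follows; the helper name active_aspm_policy is an assumption made for illustration.

```shell
# Hypothetical helper: print the ASPM policy that the kernel marks as active.
# The policy file lists every policy and puts the active one in brackets,
# for example: [default] performance powersave powersupersave
active_aspm_policy() {
    policy_file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    # Keep only the text between the square brackets
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$policy_file"
}
```

To change the policy at run time without rebooting, one of the listed names can be written back to the file, for example: echo performance | sudo tee /sys/module/pcie_aspm/parameters/policy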

In my case the error came from the built-in Wi-Fi adapter, which was never particularly stable and kept dropping the connection. To disable power saving for Wi-Fi alone, you can use TLP, an advanced command-line power management tool, which is configured in /etc/default/tlp.
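The author's TLP listing is not preserved in this copy, so the fragment below is a reconstruction under assumptions rather than the original settings. It uses TLP's WIFI_PWR_ON_AC and WIFI_PWR_ON_BAT options to turn Wi-Fi power saving off on both AC and battery power.

```shell
# Fragment for /etc/default/tlp (newer TLP releases read /etc/tlp.conf instead)
# Assumed settings: disable Wi-Fi power saving on AC and on battery
WIFI_PWR_ON_AC=off
WIFI_PWR_ON_BAT=off
```

After editing the file, running sudo tlp start should apply the new settings without a reboot.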


Arch Linux


#1 2017-04-29 11:42:22

PCIe Bus Error

I bought a new laptop and installed Arch Linux with the i3 window manager. Every time I shut down my PC while my wireless interface is up, this message is printed until I shut down my PC manually.

device at 0000:00:1c.5:

Any suggestions to fix this?

Last edited by KeyHoi (2017-05-01 12:20:35)

#2 2017-04-29 18:33:32

Re: PCIe Bus Error

Do not post pictures of output, post the actual text.

Also, can you elaborate more on this line:

Everytime I shutdown my PC while my wireless interface is up, this message(the red one) is printed till I shutdown my PC manually.

How do you initiate the shutdown and what do you mean by manual shutdown?

#3 2017-04-29 19:18:41

Re: PCIe Bus Error

What is the output of lspci? In particular, what device is at 0000:00:1c.5?

Nothing is too wonderful to be true, if it be consistent with the laws of nature — Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. — Alan Turing

How to Ask Questions the Smart Way

#4 2017-04-29 20:42:22

Re: PCIe Bus Error

Dunno what it is, except that it’s some error reported by hardware and possibly caused by things being powered down.

Just write some scripts which rmmods the WiFi driver at shutdown if you say it helps

#5 2017-04-29 23:15:21

Re: PCIe Bus Error

Do not post pictures of output, post the actual text.

Also, can you elaborate more on this line:

Everytime I shutdown my PC while my wireless interface is up, this message(the red one) is printed till I shutdown my PC manually.

How do you initiate the shutdown and what do you mean by manual shutdown?

I hold down the poweroff button.

#6 2017-04-29 23:16:54

Re: PCIe Bus Error

Dunno what it is, except that it’s some error reported by hardware and possibly caused by things being powered down.

Just write some scripts which rmmods the WiFi driver at shutdown if you say it helps

The problem is I can’t enable NetworkManager, because if i do so, the message is printed at boot up till i shutdown the Laptop.

#7 2017-04-30 05:41:07

Re: PCIe Bus Error

So it’s not just every time you shutdown with the interface up but all the time the interface is up?

OK, since it says the errors are already corrected anyway, maybe just disable error reporting. Try adding pci=noaer to the kernel boot parameters.

#8 2017-04-30 16:25:14

Re: PCIe Bus Error

Well, what we learned from your truncated lspci output was that the device that was throwing the error was the root port. Had you posted that for which I had asked, we would know what devices were on that bus and, perhaps, gained some insight into the root cause.


#9 2017-05-01 12:21:18

Re: PCIe Bus Error

Well, what we learned from your truncated lspci output was that the device that was throwing the error was the root port. Had you posted that for which I had asked, we would know what devices were on that bus and, perhaps, gained some insight into the root cause.

#10 2017-05-01 12:30:42

Re: PCIe Bus Error

I hold down the poweroff button.

Personally, I would not be shutting down that way. If I’m not mistaken, that is a «forced shutdown». Try «poweroff» from a prompt or something. (This is probably not the cause of the issue, but just general advice.)

«It is very difficult to educate the educated.»

#11 2017-05-01 12:51:08

Re: PCIe Bus Error

I hold down the poweroff button.

Personally, I would not be shutting down that way. If I’m not mistaken, that is a «forced shutdown». Try «poweroff» from a prompt or something. (This is probably not the cause of the issue, but just general advice.)


I know, but I have to. The message is printed every 0.1s till I do a forced shutdown.

#12 2017-05-01 13:56:54

Re: PCIe Bus Error

And we remain information starved.


#13 2017-05-01 14:10:49

Re: PCIe Bus Error

And we remain information starved.

What do you mean? I added the output of lspci -v

#14 2017-05-01 15:02:59

Re: PCIe Bus Error

The requested info does appear to be in the lspci output.

So attached to the affected root-port is bus 3 (secondary=03). Which is the wireless network device.
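For anyone retracing this step, the secondary bus number can be pulled out of the root port's lspci -v output with a small filter. This is only a sketch; the helper name secondary_bus is made up for illustration, and it assumes the usual "Bus: primary=..., secondary=..., subordinate=..." line that lspci -v prints for bridges.

```shell
# Hypothetical helper: read lspci -v output on stdin and print the
# secondary bus number(s) of any PCI bridge found in it.
secondary_bus() {
    sed -n 's/.*secondary=\([0-9a-fA-F]*\).*/\1/p'
}
```

Usage: lspci -v -s 00:1c.5 | secondary_bus prints the bus number (03 in this thread), and lspci -s 03: then lists the devices sitting on that bus.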

This is an interesting problem, and I would wager it is a kernel root-port driver issue or a Realtek driver issue (since the root-port is the device flagging the receiver errors, this likely means it is getting PCIe framing errors from the attached Realtek device, or the attached PCIe device is going to electrical idle and the root-port is not properly detecting it).

But something during the reset process may be causing the realtek device to go haywire with an unclean shutdown flow. Normally Intel hardware reset flows have a watchdog timer that will eventually force a global-reset even if a device is not responding, but you mentioned that reset never occurs and you are forced to issue a cold-reset (holding down power button). This might indicate that the OS is not even reaching the point of asking for a reset because it is stuck waiting on something else (hence why it could be a kernel root-port related issue with reset under the presence of errors which trigger interrupts).

I would suggest trying the LTS kernel, and seeing if it is usable and if there is any difference in reset behavior. I do know that Skylake took a long time to be usable in Linux, and so I am unsure if the LTS kernel will work for that hardware, but it will provide some extra data points that may help.

EDIT: Also check your BIOS settings and see if you have PCIe ASPM enabled. If so, try disabling it. If there are separate BIOS settings for L0s and L1, disable both.

EDIT 2: Additionally, make sure you have intel-ucode installed and enabled in your boot loader. The reset flows are controlled in SunrisePoint by firmware that lives in the PMC (power-management controller). This gets loaded by BIOS, so also check your laptop vendor for BIOS updates, as there could be fixes available.

Last edited by blahhumbug (2017-05-01 15:49:58)

#15 2018-07-20 08:51:45

Re: PCIe Bus Error

Hi, I’m brand new to Arch but not Linux, so please excuse my ignorance if I’ve said anything wrong. I’m trying to install Arch and I’ve run into the same PCIe bus error on the same port (00:1c.5), but in my case I can’t even do anything about it. I’ve been using KDE Neon till now and used to have the same problem on boot, but I managed to add pci=nomsi to my grub and this made the error not show up on boot.

But I can’t seem to do the same from my live USB install of Arch, and the error just keeps printing on screen wherever I am, so I can’t edit files either (I’m not sure if those printed errors will end up being saved in the grub file too; besides, I am not even sure if the grub file on the live USB is writable and whether changes will even be saved on the next boot). I also dual boot Windows 10. Any help would be great, I really want to try out Arch. I heard the community is great too. I posted the same problem on the KDE forums and no one gives a damn there.

I’ve also read the replies above; PCIe ASPM doesn’t show up in my BIOS. Is it possible to install intel-ucode in the storm of errors being printed on my console? Or is it a fundamental problem with my laptop? I can’t remove Windows as I use that for my games and have a ton of stuff there too.

#16 2018-07-20 11:06:09

Re: PCIe Bus Error

Please do not necrobump.
Ensure windows fastboot is off, https://www.tenforums.com/tutorials/418 … -10-a.html
If that’s settled, please open a new thread, post the actual message and an lspci output.

The «error» is none, there’re problems to communicate w/ the PCI device but the call is repeated and the error corrected. (Ie. a «loose contact», works, works not, works…)

#17 2018-07-20 11:19:26

Re: PCIe Bus Error

As Seth notes, please do not necrobump.

Sakura:-
Mobo: MSI MEG B550 UNIFY // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD
