- Linux Kernel Panic at «Code: Bad RIP value.» with One-bit Flip in RIP (Doc ID 2407801.1)
- Applies to:
- Symptoms
- Cause
- BrainyCP
- Server crashes as a result of RAM exhaustion
- Re: Server crashes as a result of RAM exhaustion
- Re: Server crashes as a result of RAM exhaustion
- Re: Server crashes as a result of RAM exhaustion
- Kernel Panic — Bad RIP value #2007
- Comments
- janeczku commented Jul 21, 2017
- SvenDowideit commented Jul 21, 2017
- janeczku commented Jul 21, 2017 •
- lynsei commented Jul 27, 2017
- lynsei commented Jul 27, 2017
- SvenDowideit commented Jul 28, 2017
- System hangs (bad RIP value) when disk used in pool is removed (zfs-0.6.5.1) #3821
- Comments
- ab-oe commented Sep 23, 2015
- kernelOfTruth commented Sep 23, 2015
- behlendorf commented Sep 23, 2015
- ab-oe commented Sep 24, 2015
- ab-oe commented Sep 24, 2015
- behlendorf commented Sep 24, 2015
- ab-oe commented Sep 25, 2015
- behlendorf commented Sep 25, 2015
- behlendorf commented Sep 25, 2015
- ab-oe commented Sep 28, 2015
- Bad RIP value from gtk3-nocsd package #286
- Comments
- marekmarecki commented Oct 22, 2020 •
- kisak-valve commented Oct 22, 2020 •
- marekmarecki commented Oct 22, 2020 •
- marekmarecki commented Oct 24, 2020
- smcv commented Oct 26, 2020
- marekmarecki commented Oct 28, 2020
- aib commented Oct 30, 2020
Linux Kernel Panic at «Code: Bad RIP value.» with One-bit Flip in RIP (Doc ID 2407801.1)
Last updated on APRIL 24, 2020
Applies to:
Symptoms
The Linux server rebooted unexpectedly several times. The crash messages show:
[ 5486.019099] BUG: unable to handle kernel paging request at ffffffdf816c7621
[ 5486.019769] IP: [ ] 0xffffffdf816c7621
[ 5486.020436] PGD 1a8d067 PUD 0
[ 5486.021172] Oops: 0010 [#1] SMP
.
[ 5486.037278] task: ffff880415608e00 ti: ffff880415610000 task.ti: ffff880415610000
[ 5486.039652] RIP: 0010:[ ] [ ] 0xffffffdf816c7621
..
[ 5486.106311] Code: Bad RIP value.
[ 5486.110081] RIP [ ] 0xffffffdf816c7621
^------ the RIP value is 0xffffffdfxxxxxxxx, with one bit flipped from 0xffffffffxxxxxxxx.
The RIP is not a NULL pointer; it is an invalid kernel-space address like the one above.
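The one-bit-flip observation can be checked with a little arithmetic. The following is a minimal sketch, assuming (as the symptom description does) that the intended address has all ones in its upper 32 bits; the helper name `flipped_bits` is made up for this illustration.

```python
def flipped_bits(observed, expected):
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = observed ^ expected
    return [pos for pos in range(diff.bit_length()) if (diff >> pos) & 1]


observed = 0xffffffdf816c7621             # faulting RIP from the oops above
expected = observed | (0xffffffff << 32)  # assumption: upper 32 bits should be all ones

print(hex(expected), flipped_bits(observed, expected))
# prints: 0xffffffff816c7621 [37]
```

A result with exactly one entry means the two addresses differ in a single bit, here bit 37.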
Cause
Source
BrainyCP
Server control panel
Server crashes as a result of RAM exhaustion
Post by romapad » Fri Feb 15, 2019 8:26 pm
I installed a clean CentOS 7 with BrainyCP on top. I noticed that after a while the server goes down: I can't get into the panel or the SSH console. Only a reboot from the hosting provider's account helps. After a while the server goes down again. This list of errors shows up in messages:
Feb 13 14:14:27 mkmv systemd: systemd-logind.service watchdog timeout (limit 3min)!
Feb 13 14:15:26 mkmv kernel: INFO: task systemd-logind:1465 blocked for more than 120 seconds.
Feb 13 14:15:26 mkmv kernel: Not tainted 4.18.15-1.el7.elrepo.x86_64 #1
Feb 13 14:15:26 mkmv kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 13 14:15:26 mkmv kernel: systemd-logind D 0 1465 1 0x00000084
Feb 13 14:15:26 mkmv kernel: Call Trace:
Feb 13 14:15:26 mkmv kernel: __schedule+0x2ab/0x880
Feb 13 14:15:26 mkmv kernel: ? __wake_up_common+0x8f/0x160
Feb 13 14:15:26 mkmv kernel: schedule+0x36/0x80
Feb 13 14:15:26 mkmv kernel: schedule_timeout+0x1dc/0x300
Feb 13 14:15:26 mkmv kernel: ? radix_tree_iter_tag_set+0x1b/0x20
Feb 13 14:15:26 mkmv kernel: wait_for_completion+0x121/0x180
Feb 13 14:15:26 mkmv kernel: ? wake_up_q+0x80/0x80
Feb 13 14:15:26 mkmv kernel: __wait_rcu_gp+0x123/0x150
Feb 13 14:15:26 mkmv kernel: synchronize_sched+0x5e/0x80
Feb 13 14:15:26 mkmv kernel: ? __call_rcu+0x2d0/0x2d0
Feb 13 14:15:26 mkmv kernel: ? __bpf_trace_rcu_utilization+0x10/0x10
Feb 13 14:15:26 mkmv kernel: namespace_unlock+0x6a/0x80
Feb 13 14:15:26 mkmv kernel: ksys_umount+0x236/0x450
Feb 13 14:15:26 mkmv kernel: ? syscall_trace_enter+0x1cd/0x2b0
Feb 13 14:15:26 mkmv kernel: __x64_sys_umount+0x16/0x20
Feb 13 14:15:26 mkmv kernel: do_syscall_64+0x60/0x190
Feb 13 14:15:26 mkmv kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb 13 14:15:26 mkmv kernel: RIP: 0033:0x7fb4e1771f47
Feb 13 14:15:26 mkmv kernel: Code: Bad RIP value.
Feb 13 14:15:26 mkmv kernel: RSP: 002b:00007ffef4b48b18 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Feb 13 14:15:26 mkmv kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb4e1771f47
Feb 13 14:15:26 mkmv kernel: RDX: 000000000000a740 RSI: 0000000000000002 RDI: 000055d874c4b7d0
Feb 13 14:15:26 mkmv kernel: RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000008030
Feb 13 14:15:26 mkmv kernel: R10: 0000000000000076 R11: 0000000000000246 R12: 000055d874c4b2e0
Feb 13 14:15:26 mkmv kernel: R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
Feb 13 14:21:57 mkmv systemd: systemd-logind.service still around after final SIGKILL. Entering failed mode.
Feb 13 14:21:57 mkmv systemd: Unit systemd-logind.service entered failed state.
Feb 13 14:21:57 mkmv systemd: systemd-logind.service failed.
Feb 13 14:21:57 mkmv systemd: systemd-logind.service has no holdoff time, scheduling restart.
Feb 13 14:21:57 mkmv systemd: Stopped Login Service.
Feb 13 14:21:57 mkmv systemd: Starting Login Service.
Feb 13 14:21:57 mkmv systemd-logind: Failed to register name: File exists
Feb 13 14:21:57 mkmv systemd-logind: Failed to fully start up daemon: File exists
Feb 13 14:21:57 mkmv systemd: systemd-logind.service: main process exited, code=exited, status=1/FAILURE
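One detail worth checking in a trace like this is the code-segment selector printed in front of the RIP. On x86-64 Linux, 0010 is the kernel code segment and 0033 the 64-bit user code segment, so the RIP in the hung-task report above is a user-space address saved at syscall entry. This suggests, as an interpretation rather than something stated in the thread, that "Code: Bad RIP value." here only means the kernel could not dump the code bytes at that address. A minimal sketch that classifies such lines follows; the helper name `classify_rip` is hypothetical, and the regular expression only covers the two raw-address formats quoted on this page.

```python
import re

# Code-segment selectors used by x86-64 Linux:
# 0x10 is the kernel code segment, 0x33 the 64-bit user code segment.
SEGMENTS = {0x10: "kernel", 0x33: "user"}

# Matches "RIP: 0033:0x7fb4e1771f47" and "RIP: 0010:[ ] [ ] 0xffffffdf816c7621".
RIP_RE = re.compile(r"RIP:\s*([0-9a-fA-F]{4}):\s*(?:\[[^\]]*\]\s*)*0x([0-9a-fA-F]+)")


def classify_rip(line):
    """Return (segment_name, address) for a RIP line, or None if none is found."""
    match = RIP_RE.search(line)
    if match is None:
        return None
    selector = int(match.group(1), 16)
    address = int(match.group(2), 16)
    return SEGMENTS.get(selector, "unknown"), address


print(classify_rip("Feb 13 14:15:26 mkmv kernel: RIP: 0033:0x7fb4e1771f47"))
```

Lines that print the RIP symbolically (function name plus offset) are not matched by this pattern and return None.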
Re: Server crashes as a result of RAM exhaustion
Post by Deite » Sun Feb 17, 2019 2:26 am
Re: Server crashes as a result of RAM exhaustion
Post by Den » Mon Feb 18, 2019 12:36 pm
Looks like I have the same mess after updating the panel. I wish I hadn't updated it.
This morning I found that memcached had gone down. I tried to restart it, but no restart, start, or stop commands work, neither from the console nor from the panel. The same goes for nginx and everything else. When I tried to restart nginx, it simply crashed and never came back up. Restarting the server fixed everything. But a couple of hours later memory and swap filled to the brim, the CPU was at 100% load, and everything hung.
Even reboot from the console doesn't work, only from the hosting provider's panel!
This is a nightmare!
How do I roll the panel back to the previous version?
Re: Server crashes as a result of RAM exhaustion
Post by vikont » Mon Feb 18, 2019 4:42 pm
This is a nightmare!
How do I roll the panel back to the previous version?
What helped me was a complete reinstall of the server! Figuring out why phpMyAdmin glitches, why one site's data disappeared, why nginx fell apart, and all the other problems simply wasn't worth it!
Every update puts us to a survival test! Is that the tactic? When will we finally stop asking Hamlet's question: "To be or not to be"?
Source
Kernel Panic — Bad RIP value #2007
Comments
janeczku commented Jul 21, 2017
RancherOS Version: (ros os version)
v1.0.0
Where are you running RancherOS? (docker-machine, AWS, GCE, baremetal, etc.)
VMWARE
SvenDowideit commented Jul 21, 2017
Is this a plain, unconfigured RancherOS, or how is it configured?
And have you tried a more recent kernel, like the one in 1.0.3? 1.0.0 is very different.
janeczku commented Jul 21, 2017 •
It's a fully configured host. It is also configured with ZFS on top of NAS VMDKs and SSD caches (it was set up using a custom container, i.e. not using the zfs os-service).
lynsei commented Jul 27, 2017
@janeczku are you sure this host was configured with ZFS? This is just a plain old ds host as far as I can tell. I don’t believe we did anything special for ZFS on it.
lynsei commented Jul 27, 2017
We have not tried 1.0.3 and have still been unable to reproduce this error since then in the wild. @SvenDowideit
SvenDowideit commented Jul 28, 2017
@janeczku we could do with further clarification, and I’ll be reading a few changelogs to see what they have.
Source
System hangs (bad RIP value) when disk used in pool is removed (zfs-0.6.5.1) #3821
Comments
ab-oe commented Sep 23, 2015
Hello,
I created a new zpool with one disk and started copying files. Then I unplugged the disk, and the system hung instead of suspending I/O. After a few tests I was able to capture the call trace:
On version 0.6.4 everything works well. With the latest ZoL there is no way to get I/O suspended, because the system always hangs.
kernelOfTruth commented Sep 23, 2015
@ab-oe thanks for the report!
referencing: #3817 Hang on USB disconnect
that report also shows
behlendorf commented Sep 23, 2015
@ab-oe if you’re able to reproduce this, could you try setting the module option spl_taskq_thread_dynamic=0? That may resolve the issue and be an acceptable workaround until it can be addressed properly. This effectively reverts the taskqs to their 0.6.4 behavior.
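Before and after changing a module option it helps to confirm what the loaded module is actually using. Loaded kernel modules expose their parameters under /sys/module/<module>/parameters/, so a minimal sketch of such a check could look like the following. The helper name `read_module_param` and its `sysfs_root` argument are made up for this illustration; a persistent setting would normally be a line such as `options spl spl_taskq_thread_dynamic=0` in a file under /etc/modprobe.d/ (the exact file name is up to the administrator).

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the current value of a loaded module's parameter, or None.

    None means the module is not loaded, the parameter does not exist,
    or the file is not readable by the current user.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


# Prints the active value, or None if the spl module is not loaded.
print(read_module_param("spl", "spl_taskq_thread_dynamic"))
```

The `sysfs_root` argument exists only so the function can be pointed at a test directory instead of the real sysfs.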
ab-oe commented Sep 24, 2015
@behlendorf unfortunately, setting spl_taskq_thread_dynamic to 0 doesn’t resolve this issue. The system hung just like before.
ab-oe commented Sep 24, 2015
I performed a bisection and found that the missing I/O suspend was introduced in b39c22b. With that commit, z_null_int takes 100% of the CPU; the system is barely responsive, but it still works.
I got the following call trace:
behlendorf commented Sep 24, 2015
@ab-oe thanks for posting the debugging. This definitely looks like a duplicate of #3652, and it’s clear that the z_null_int thread is getting blocked spinning on the taskq spin lock. Thanks for bisecting the change, that’s helpful. Have you tried reverting b39c22b and setting spl_taskq_thread_dynamic to 0? Does it resolve the issue?
ab-oe commented Sep 25, 2015
@behlendorf yes, it works with b39c22b reverted, even if spl_taskq_thread_dynamic is set to 1. I haven’t yet found the commit that causes the immediate system hang when the disk is removed.
behlendorf commented Sep 25, 2015
Fix proposed in #3833.
behlendorf commented Sep 25, 2015
Resolved by 5592404, which will be cherry-picked into the 0.6.5.2 release.
ab-oe commented Sep 28, 2015
@behlendorf thank you. 5592404 fixes this issue.
There is another issue, #3577. I tested it recently on a ZFS version with b39c22b, and it seemed that WRITE_SYNC had resolved it; now it has been introduced again.
Source
Bad RIP value from gtk3-nocsd package #286
Comments
marekmarecki commented Oct 22, 2020 •
Proton version: 5.13-1
Ubuntu 20.04.1, nvidia
Games affected (all non-native games that I have installed are affected):
- Age of Empires II: Definitive Edition (id 813780)
- Age of Empires II (2013) (id 221380)
- Fallout: New Vegas (id 22380)
- Grand Theft Auto IV: The Complete Edition (id 12210)
With Proton 5.0-9, everything above launched fine.
I can see some strange things in syslog:
Oct 22 20:53:38 ubuntu-pc kernel: [ 593.762926] pressure-vessel[12294]: segfault at 0 ip 0000000000000000 sp 00007fffaeee1eb8 error 14 in pressure-vessel-launch[400000+14000]
Oct 22 20:53:38 ubuntu-pc kernel: [ 593.762933] Code: Bad RIP value.
Logs that I collected:
- Steam’s stdout from launching the Steam client and trying to run AoE2:DE. Not much info here.
- stdout with more info (/.steam/error.log)
Putting PROTON_LOG=1 %command% in the launch options does not create any logs (it did create logs with Proton 5.0-9).
Where should I look to get to the bottom of the problem?
Small update:
I have the exact same problem (the same errors in the logs) on another machine (also Ubuntu 20.04.1, but with Steam installed from the package provided on Steam’s site rather than by the distro, if that makes any difference).
kisak-valve commented Oct 22, 2020 •
Hello @marekmarecki, this reads like a pressure vessel issue, which Proton 5.13 runs on top of. I suspect that the debian-modified Steam package is stealing any useful hints from the terminal spew and putting them in /.steam/error.log. Can you check if that exists and if there are any hints in it?
marekmarecki commented Oct 22, 2020 •
Hi @kisak-valve. My bad. When I last checked this file, there was nothing interesting inside. Now it is generating more info. I updated the problem description with another gist. There are some errors there, like this for example:
ERROR: ld.so: object '/home/marek/.steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
/usr/lib/x86_64-linux-gnu/gvfs/libgvfscommon.so: undefined symbol: g_task_new
Failed to load module: /usr/lib/x86_64-linux-gnu/gio/modules/libgvfsdbus.so
Segmentation fault
Here is the content of /usr/lib/x86_64-linux-gnu/gio/modules/:
-rw-r--r-- 1 root root 270 lip 8 21:44 giomodule.cache
-rw-r--r-- 1 root root 67656 mar 11 2020 libdconfsettings.so
-rw-r--r-- 1 root root 22680 cze 23 08:45 libgiognomeproxy.so
-rw-r--r-- 1 root root 133408 cze 23 08:45 libgiognutls.so
-rw-r--r-- 1 root root 18584 cze 23 08:45 libgiolibproxy.so
-rw-r--r-- 1 root root 133344 kwi 14 2020 libgioremote-volume-monitor.so
-rw-r--r-- 1 root root 231928 kwi 14 2020 libgvfsdbus.so
marekmarecki commented Oct 24, 2020
I can confirm that removing gtk3-nocsd package from system and rebooting OS solves the problem: log from game running on 5.13-1. Confirmed on two machines. Big thanks to aib’s comment.
smcv commented Oct 26, 2020
The libgtk3-nocsd.so.0 preload library is known to break non-Steam programs quite regularly, too, and the GTK maintainers consider it to be a problem. I would recommend not using it.
If we do anything to solve this crash, it will probably be the brute-force approach: when we are adjusting LD_PRELOAD to work inside the container, remove any entry that looks like it might be libgtk3-nocsd.
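The brute-force idea described above can be illustrated in a few lines. This is only a sketch of the filtering step, not pressure-vessel's actual implementation: the helper name `strip_preload` is made up, and "looks like" is approximated as a substring match on the file name. It relies on the documented ld.so behaviour that LD_PRELOAD entries are separated by colons or spaces.

```python
import os
import re


def strip_preload(value, needle="libgtk3-nocsd"):
    """Drop LD_PRELOAD entries whose file name contains `needle`."""
    # ld.so accepts both colons and whitespace as separators.
    entries = [entry for entry in re.split(r"[:\s]+", value) if entry]
    kept = [entry for entry in entries if needle not in os.path.basename(entry)]
    return ":".join(kept)


# Build a cleaned environment for a child process without touching our own.
env = dict(os.environ)
env["LD_PRELOAD"] = strip_preload(env.get("LD_PRELOAD", ""))
```

Matching on the base name rather than the full path keeps an unrelated library from being dropped just because it lives in a directory whose name happens to contain the same string.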
marekmarecki commented Oct 28, 2020
I removed the package, no problem. It was not installed directly, just pulled in as a dependency at some point in the past, and I do not need it. But other people may also have it and not know what to do, especially since it doesn’t come up in the logs that easily (in my attached logs there is no mention of this lib; you would need to dig deeper to find it).
Cheers and keep up the good work 👍
aib commented Oct 30, 2020
Indeed; I used gdb to trace the segfault to nocsd. And there’s no direct connection to Steam or Proton, so it would be hard to stumble upon the solution. That’s why I commented; glad to see it helped!
Source