NFS Stale File Handle error in Linux


How to resolve mount.nfs: Stale file handle error

Published: January 1, 2017 | Modified: June 20, 2020

Learn how to resolve the mount.nfs: Stale file handle error on the Linux platform. This is a Network File System error that can be resolved from the client or the server end.

When you are using the Network File System in your environment, you have probably seen the mount.nfs: Stale file handle error at times. This error means that the NFS share cannot be mounted because something has changed since the last known good configuration.

Rebooting the NFS server, NFS processes not running on the client or server, or the share not being properly exported at the server can all be reasons for this error. Moreover, it's irritating when this error appears on a previously mounted NFS share, because that means the configuration part is correct: it mounted successfully before. In such a case, one can try the following steps:

Make sure the NFS services are running properly on both the client and the server.

If the NFS share is currently mounted on the client, un-mount it forcefully and try to remount it on the NFS client, as shown below. Check that it is properly mounted with the df command and by changing directory inside it.
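A minimal sketch of that sequence, assuming the share is mounted at /mnt/share from a host named nfsserver (both names are placeholders):

# umount -f /mnt/share
# mount -t nfs nfsserver:/path/to/share /mnt/share
# df -h /mnt/share
# cd /mnt/share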

In the mount command above, nfsserver can be an IP address or the hostname of the NFS server.

If you are getting an error while forcefully un-mounting, like the one below:
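The busy-device failure typically looks like this (the mount point is a placeholder):

# umount -f /mnt/share
umount2: Device or resource busy
umount.nfs: /mnt/share: device is busy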

Then you can check which processes or users are using that mount point with the lsof command, like below:
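One hedged way to list them, reusing the placeholder mount point (fuser -m /mnt/share is a common alternative):

# lsof | grep /mnt/share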

If lsof shows that several PIDs are using files on the said mount point, try killing them off to free the mount point. Once done, you will be able to un-mount it properly.

Sometimes the mount command still gives the same error. Then try mounting after restarting the NFS service at the client, using the command below.
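A sketch for a SysV-init client such as the RHEL 6.3 systems these outputs came from; the exact service name varies by distribution, and on systemd systems the equivalent would be systemctl restart:

# service nfslock restart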

Even if this didn't solve your issue, the final step is to restart the services at the NFS server (see the commands below). Caution! This will disconnect all NFS shares exported from the NFS server, and all clients will see their mount points disconnect. This is the step where 99% of you will get your issue resolved. If not, then the NFS configuration must be checked, provided you changed the configuration and started seeing this error afterwards.
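On the server side, again assuming a SysV-init system like RHEL 6; exportfs -ra, which simply re-exports all shares, is gentler and worth trying first:

# exportfs -ra
# service nfs restart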

The outputs in the above post are from a RHEL 6.3 server. Drop us your comments related to this post.


NFS Stale File Handle error and solution

Sometimes NFS can result in weird problems. For example, NFS-mounted directories sometimes contain stale file handles. If you run a command such as ls or vi, you will see an error:
$ ls
.: Stale File Handle

First, let us try to understand the concept of a stale file handle. The book Managing NFS and NIS, 2nd Edition defines file handles as follows (a good book if you would like to master NFS and NIS):
A filehandle becomes stale whenever the file or directory referenced by the handle is removed by another host, while your client still holds an active reference to the object. A typical example occurs when the current directory of a process, running on your client, is removed on the server (either by a process running on the server or on another client).

So this can occur if the directory is modified on the NFS server, but the directory's modification time is not updated.

How do I fix this problem?

a) The best solution is to remount the directory from the NFS client using the mount command:
# umount -f /mnt/local
# mount -t nfs nfsserver:/path/to/share /mnt/local


The first command (umount) forcefully unmounts the NFS-mounted disk partition /mnt/local.

(b) Or try to mount the NFS directory with the noac option, as sketched below. However, I don't recommend using the noac option because of its performance impact. Moreover, checking files on an NFS filesystem referenced by file descriptors (i.e. with the fcntl and ioctl families of functions) may lead to inconsistent results due to the lack of a consistency check in the kernel, even when noac is used.
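A minimal sketch of such a mount, with placeholder server and path; noac disables attribute caching, which is exactly where the performance penalty comes from:

# mount -t nfs -o noac nfsserver:/path/to/share /mnt/local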


Comments

I encounter these errors on my Ubuntu client when I reboot my Fedora server with the NFS shares. I find I need to first umount the shares on the client, then restart nfs on the server, then remount the shares on the client. I don't claim to understand the logic behind this, but it works.

I had this error on my home laptop, so it had nothing to do with servers. It happened when I tried to install Songbird on my computer; something went wrong, and when I wanted to reinstall Songbird my computer gave this error. I tried to reboot and to delete the /usr/share/songbird directory, but nothing worked. Finally I left it as it was and just ran the scripts while they weren't in place (right from my home folder), and now, a few weeks later, the problem has resolved itself and I could reinstall Songbird without problems. If you can say how this happened, please let me know.

When a client mounts a server's exported NFS mount point to the client's specific mount point, the client and server negotiate a unique ID for that event (i.e. the ID is unique for the mount, mount point, server, server exported file system, etc.); there is therefore a new unique ID for every successful mount request. All communications between the client and the server include this unique ID. When a server reboots, the server (intentionally) will have no record of the unique ID, hence the client will get an error when it tries to access the remote file system. The stale file handle error is telling you that this has occurred.

Unmounting (force) and remounting from the client will resolve this problem IF there are no other references within the client that are retaining the handle/ID. I.e. this is a problem for processes that are running and have open file handles when you do the force unmount. When you force unmount/remount and still get the error, you have some program that is probably (?) not following proper file handle semantics about closing files.

However, watch out for a related problem: in autofs I have seen timeout options (see the sketch below for where the timeout is set). These can cause a stale NFS handle in some combinations of NFS client-to-server communications (version 2 as well), even with processes that (correctly) hold open file handles.
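For reference, a hedged sketch of where such a timeout lives: a map entry in /etc/auto.master can carry a --timeout option (paths and value are placeholders):

/misc /etc/auto.misc --timeout=60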

So, a stale NFS handle occurring on a client after a server reboot, resolved on the client by an un/remount of the client’s file system is proper behaviour.

I just bought a laptop with Linpus Linux on it, and the first time I started it, it gave me this notice:
NFS Stale File Handle error. I don't have much experience with Linux; how can I solve the problem, or install Windows XP (when I tried, it gave me a similar error and stopped, so as not to damage the system)?

Usually I find that some operation that forces the inode tables on client and server to update will de-zombify stale NFS data. Something as simple as remaking the mount point directory on the client (I've seen that solve some really weird stuff), or renaming a directory on the server, forcing the client to try to use it, then renaming it back. This is all assuming simple umount/mount and/or kill/restart stuff doesn't work.

*cough*network failure system*cough*

Nice, thanks for the tip… a simple two commands to fix it. Made a little bash script for them… thanks again.

Using rpc.statd is better, as it constantly checks the state of the NFS system and keeps the mount status up to date.

If you have the fstab entry, you can simply umount and then mount again using just the mount point (a sample fstab entry follows after the commands). Something like this:
umount -f /mnt/local
mount /mnt/local
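For context, a hedged example of the /etc/fstab entry that makes the short mount command above work (server, path, and options are placeholders):

nfsserver:/path/to/share /mnt/local nfs defaults 0 0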

I am running Slax. I had the same problem after installing a module. I uninstalled the module and the problem was gone.

ls -alh in the offending directory regularly clears this problem up for me.

ls -alh /mnt/volume/directory won’t do it
ls -alh /mnt/volume/directory/ will.

Done! Fixed in one shot: umount -f on the mount point, then mounted it again.

I was getting the below error:
mount.nfs: Stale NFS file handle

#umount -f /mount point
# mount -t nfs nfs:server /mount point
#

I have this problem on my system running on Windows. I am running an embedded program loaded on a chip, not connected to a server at all. When I do ls it shows the error "Stale NFS file handle". I am using PuTTY / HyperTerminal. The solutions mentioned above are not working for me; please help me out.

Thanking you in advance.

I have my file system stored on the hda2 partition, which is of type ext2, and I am getting the following error:
mount /dev/hda2 /mnt/hda2
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
#
# cd /mnt/hda2
#
# ls
ls: ./^J#: Stale NFS file handle
ls: ./AppInfo: Stale NFS file handle
ls: ./C: Stale NFS file handle

Settings selinux
alsa-mae-init launchup sys
b# lib test_snd
bin linuxrc tmp
config_init lost+found tmp_bac
data mnt
dev opt usr
etc proc var
i root
init sbin

# umount -l /mnt/point

# mount nfs:/path/mount /mnt/point

Above worked for me!

I guess this lazy umount option should be added to the article. 🙂

Only this option helped for me too!
Thank you very much!

You saved my day, Nirmal. I was not able to umount with -f until I saw your solution. -l worked perfectly for me. Thanks Nirmal again :)

Thanks a lot. My work is done with your help.

The umount/mount fix is a *really* big hammer. Besides, it only fixes the symptoms; it doesn't fix the disease. Hasn't anyone written a fix so that when the stale condition is detected, the filehandle/dirhandle in question is refreshed (made non-stale)? Seems like a simple enough fix.

We get the stale handles between RedHat Ent. and Tru64, sometimes within seconds of making the mount. Neither file system has been restarted. A umount/mount will fix it for a short time, but it returns quickly. Using the NOAC option significantly reduces the frequency of occurrence, but also significantly reduces write speed to the mounted files; even within one open/write/close session, the write is greatly slowed. This is especially curious, since the file attributes should be immaterial during the writes, IMO.

Our biggest problem is that we can't pay a trained monkey to sit around watching for stale handle incidents and do the umount/mount 24/7. Besides, the umount will affect other running processes. It's hardly a win. Does anyone know of better solutions? Has anyone else had success with the `ls -alh' fix?

Perhaps I am missing something, but I have never ONCE been able to get the above to work.

I always get stuck with:
# umount -f /mnt/home-ext
umount2: Device or resource busy
umount.nfs: /mnt/home-ext: device is busy

Linux can be so stupid sometimes. I mean, how can it be busy if the mount is stale and so, by its nature, NOTHING is able to access it?

A process – maybe your own bash shell – has /mnt/home-ext as the current directory.
lsof |grep /mnt/home-ext shows what uses – or wants to use – that dir.

Always good tips

Can this error occur when NFS has a short network outage?

I really don't know, but I'm having a problem with a server whose directories are mounted on a different server. My guess is, like you said, a problem with the network, but so far I haven't found a way to test it.

Did you have a similar problem or did you get any answer to your question?

Thanks to Nirmal Pathak. -f was not enough…

If you are running the Nautilus file manager, you'll probably find the problem is Nautilus, not NFS at all. Try "killall nautilus" from any shell prompt. Works for me, so far.

This problem is haunting me in newer Fedora 17 installs. Autofs works fine, but it is not timing out the mounts after the resource is no longer in use, so they seem to stay alive until something like the remote host being rebooted happens.
Then the stale mount thing…
But after forcing the umount (which seems to fix things partially; no more stale mount alerts), now I get:
Too many levels of symbolic links

Any hint on this?
Thanks

I got the same issue (mount.nfs: Stale NFS file handle) the first time I attempted to mount a shared folder.
I don't really have anything to umount or anything in a busy state.

Any idea appreciated

The `ls -alh' trick always works fine for me. I just apply it to the parent directory of the one causing the error. After that, I can access the directory/file without any problem.

Try the fix on the server first; if that does not work, then try it on the client. If the unmount fails with a device busy/in-use error, find the offending processes, kill them, and retry the remount (a sketch of these steps follows below).
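A hedged reconstruction of those steps: the server-side exportfs -f is suggested by the follow-up comment just below, and the client-side commands are standard practice (the mount point is a placeholder). fuser -m lists the PIDs holding the mount; kill those before retrying.

On the server:
# exportfs -f

On the client:
# umount -f /mnt/share
# fuser -m /mnt/share
# mount /mnt/share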

I’ve been using NFS for the better part of 20 years and have run into this problem off and on but never found a solution until I came across this post.

exportfs -f on the server did the trick for me.

Thanks! This just got added to my toolbox of sysadmin tricks.

I have this error from time to time. I have tried all the solutions mentioned up to here.

Thanks Paul Freeman; exportfs -f was new to me and solved the problem without restarting NFS or the server.

I found that I could not umount -f /path; I was always getting the stale message.
When I looked at my server, I could see in the exportfs -av output that the IP address listed was not what I was connected on anymore. Looking at my router, I found that I had a dynamic DHCP address. I added a reservation for my MAC address on my old IP address, reconnected my wireless, then mounted, and everything worked as before.

I had this problem and I was sure that the files/directories were still available on the host.
So I just went up one directory and came back into the current directory, and everything was working then.

Thank you. My yum was locking up, and when I did a trace it turned out to be a stale NFS mount.

strace yum -y update

I unmounted and remounted and it worked again. That explains the yum and server issues.

I have tried most of these proposals on Debian; nothing works but a reboot.

My solution was a reboot. Nothing else helped me. Ubuntu 12.04.4.

Hi,
It worked for me. I forcefully unmounted and mounted again.
Thank you.

Thank you! You helped me to fix my problem.

I am facing an issue with my NFS filesystem. When I try to access directories, it displays an unknown error:

STGDPMWEB1:/shareddata # cd STP
-bash: cd: STP: Unknown error 521

I am using the below command to mount… can anyone help, please?

mount -t nfs4 NSDLSTAG:/HFS/shareddata /shareddata/

STGDPMWEB1:/shareddata # ls -lart
ls: cannot access uploadfiles: Unknown error 521
ls: cannot access downloadfiles: Unknown error 521
ls: cannot access mail: Unknown error 521
ls: cannot access MessagesExportedFromProjectWeb: Unknown error 521
ls: cannot access STP: Unknown error 521
ls: cannot access abcd: Unknown error 521
total 5
d. ? ? ? ? ? uploadfiles
d. ? ? ? ? ? mail
d. ? ? ? ? ? downloadfiles
-. ? ? ? ? ? abcd
d. ? ? ? ? ? STP
d. ? ? ? ? ? MessagesExportedFromProjectWeb
drwxr-xr-x 28 root root 4096 Feb 3 11:35 ..
drwxrwxrwx 7 root bin 512 Feb 3 13:21 .

You need to reboot. This resolves the issue of files shown with . ? ? ? ? ?

So much misinformation in this thread! It is not true at all that rebooting the NFS server should lead to stale file handles. Indeed, NFS was designed to be a stateless protocol where the server could be rebooted, or indeed migrated to a different host, without interruption to clients (besides a delay in response to NFS requests while the server is down).

If this is not happening, then your NFS configuration is broken. There are numerous ways this breakage can happen on the Linux NFS server. One way is if you do not specify the fsid option in your /etc/exports (a sample line follows below), and your NFS server decides to automatically assign a different fsid portion of the file handle after a reboot. Whether this will be a problem depends on your configuration.
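For illustration, a hedged /etc/exports line that pins the fsid so the file handles survive a reboot (path, network, and options are placeholders):

/srv/share 192.168.1.0/24(rw,sync,fsid=1)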

Another way things can break is if you start up the NFS daemon on the NFS server and make it available on the network before having exported all filesystems. It’s crucial when using NFS in a HA environment that the virtual HA IP address is only added to a server after an NFS failover once all exports have been loaded on the server.

About the only situation in a correctly configured NFS environment where you will get stale NFS file handle and have to remount filesystems on the client is if the server was restored from a filesystem-level (not block-level) backup, leading to files having different inode numbers and therefore different NFS file handles. There is no simple way around this issue.

This just touches the surface of some of the possible issues that could lead to the type of problems OP describes. Having to unmount then remount filesystems is *not* the expected behaviour of a correctly configured NFS environment.

I've found this error started when I mounted the NFS share on an AIX machine and created a directory. I just rebooted and it worked again.

I had the same problem.

“df: `/mnt/nfs_share_name’: Stale NFS file handle”

But it was not possible for me to reboot the server.

The solution is to unmount all shares with stale NFS file handles and then mount them again.

But “mount -t nfs nfsserver:/path/to/nfs/share /mnt/destination” was giving me this error:
“mount.nfs: Stale NFS file handle”

After this, I mounted the NFS share with the following command:
“mount -t nfs -o ro,soft,rsize=32768,wsize=32768 NFS_HOST:/path_to_nfs_share /mnt/mount_destination”

In my case, I use mounts with read-only permissions and don't want them to mount on boot, so there is no reference to them in /etc/fstab.

If you need to read and write, you have to use "rw" instead of "ro".
