[cgl_discussion] Re: OSDL CGL-WG draft specs available for review
Mark Huth
mark.huth at mvista.com
Thu Apr 24 11:24:36 PDT 2003
FWIW -
I looked at Tigran's implementation, and it was a decent job. The
algorithm used was to install a filesystem to handle requests to the
unmounted file system by returning an error. He then looped through the
task list, removing file descriptors that reference the subject file
system, removing working directory references and removing mmaps. At
the end, the file system should have the correct reference count for
unmounting. The main issue is that I was unclear whether new requests
could occur after the list was walked, potentially causing the unmount
not to occur. The other issue is scalability with large numbers of
tasks, perhaps.
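
To make the task-list walk concrete, here is a toy user-space model in C.
It is not Tigran's code: struct toy_fs, struct toy_task, toy_open,
toy_chdir and toy_walk_tasks are hypothetical names, and a plain integer
stands in for the kernel's reference count. It only shows each task
dropping its descriptors and cwd on the target file system.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 4

/* Hypothetical stand-in for a mounted file system: just a reference count. */
struct toy_fs {
    int refs;
};

/* Hypothetical stand-in for a task: open files plus a working directory. */
struct toy_task {
    struct toy_fs *fds[TOY_MAX_FDS]; /* NULL means the slot is unused */
    struct toy_fs *cwd;              /* file system holding the cwd, or NULL */
};

/* Take a reference on fs through an unused descriptor slot of task t. */
static void toy_open(struct toy_task *t, int fd, struct toy_fs *fs)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS);
    assert(t->fds[fd] == NULL);
    t->fds[fd] = fs;
    fs->refs++;
}

/* Point an unset cwd at fs, taking a reference. */
static void toy_chdir(struct toy_task *t, struct toy_fs *fs)
{
    assert(t->cwd == NULL);
    t->cwd = fs;
    fs->refs++;
}

/*
 * Walk every task once and drop each reference it holds on target.
 * Returns the number of references removed.
 */
static int toy_walk_tasks(struct toy_task *tasks, size_t ntasks,
                          struct toy_fs *target)
{
    int dropped = 0;

    for (size_t i = 0; i < ntasks; i++) {
        for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
            if (tasks[i].fds[fd] == target) {
                tasks[i].fds[fd] = NULL;
                target->refs--;
                dropped++;
            }
        }
        if (tasks[i].cwd == target) {
            tasks[i].cwd = NULL;
            target->refs--;
            dropped++;
        }
    }
    return dropped;
}
```

In this model, a toy_open() on the target after toy_walk_tasks() has
returned leaves refs above zero again, which is the window I was unsure
about. The walk is also linear in tasks times descriptors, hence the
scalability question.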
My implementation is a bottom-up one, with the unmount process first
"walling off" system calls that could create new references. The mount
point is marked as fumount pending, and subsequent syscalls that use a
file descriptor have the fget fail with a NULL file object, leading the
calls to return -EBADF. If the syscall uses a file name string, then
the namei lookups fail, generally returning -ENXIO.
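
A rough user-space sketch of that "walling off" step follows. It is only
an illustration of the idea, not the patch itself: struct toy_mount,
struct toy_file, toy_fget, toy_read and toy_lookup are made-up names, and
the real fget and namei paths take different arguments and do far more.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mount record carrying the "fumount pending" mark. */
struct toy_mount {
    bool fumount_pending;
};

/* Hypothetical open file that remembers which mount it lives on. */
struct toy_file {
    struct toy_mount *mnt;
};

/* Descriptor lookup: returns NULL once the file's mount is being torn down. */
static struct toy_file *toy_fget(struct toy_file **table, size_t nfds, int fd)
{
    assert(table != NULL);
    if (fd < 0 || (size_t)fd >= nfds || table[fd] == NULL)
        return NULL;
    if (table[fd]->mnt->fumount_pending)
        return NULL;
    return table[fd];
}

/* A descriptor-based call: a NULL file object turns into -EBADF. */
static int toy_read(struct toy_file **table, size_t nfds, int fd)
{
    struct toy_file *f = toy_fget(table, nfds, fd);

    if (f == NULL)
        return -EBADF;
    return 0; /* the real call would do the I/O here */
}

/* A path-based call: the lookup refuses to enter a pending mount. */
static int toy_lookup(const struct toy_mount *mnt)
{
    if (mnt->fumount_pending)
        return -ENXIO;
    return 0;
}
```

The point of gating at the lookup helpers is that no individual syscall
needs to know about fumount; each one just sees its usual "bad
descriptor" or "lookup failed" path.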
Then there is a pause, the intention of which is to allow contexts in
the kernel to clear out, such as pending reads. Following that, the
file objects are marked as subject to fumount, and then outstanding
references to the file objects are cancelled: locks are removed and
mmaps deleted. If there remain outstanding file references, the file
object is cloned, leaving the old object with null operations. Following
that, only a close will succeed on the old object. The cloned object is
taken over by the umount process, and closed however many times are
required to drive the reference count to 0.

Once the super block file list is emptied, the mount reference count is
checked. If it is still not at the magic number, then the task list is
walked, looking for cwd entries that are subordinate to the fs mount
point. If one is found, the task is given a NULL cwd. Various parts of
the lookup routines have had error handling added so that the NULL cwd
entry does not cause problems. After that, the mount ref count is
checked again. If it is still not the magic number, then an arbitrary
mntput is done, potentially losing resources. However, that has not
happened in our testing. It's complicated, but seems solid after our
testing.

There are a couple of rules that the sysadmin should follow. First, any
NFS exports subordinate to the mount point should be unexported before
fumount. I can't find non-process related kernel references, and NFS is
the only entity that might pose a problem there. It's stateless, so the
problem is transitory, but nonetheless, I recommend removing the export.
The second rule is that if the mount point has subordinate mounts, these
must be removed first by the admin; I chose not to allow the unmount to
automatically recurse. Finally, the filesystem cannot be the root
filesystem, although that is arbitrary as far as the code is concerned:
fumount checks for it and doesn't do the forced unmount if the subject
is /.
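
The clone-and-drain step for file objects described above can be
sketched in user space as follows. This is a toy model under stated
assumptions: struct toy_file, struct toy_ops, toy_detach and toy_drain
are hypothetical names, the "null operations" are modelled as an ops
table whose read fails, and the -EBADF it returns is chosen only for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_file;

/* Hypothetical operations table with a single entry. */
struct toy_ops {
    int (*read)(struct toy_file *f);
};

/* Hypothetical file object: an ops pointer and a reference count. */
struct toy_file {
    const struct toy_ops *ops;
    int count;
};

static int real_read(struct toy_file *f) { (void)f; return 0; }

/* Stand-in for a "null" operation; the error code is an assumption. */
static int dead_read(struct toy_file *f) { (void)f; return -EBADF; }

static const struct toy_ops real_ops = { real_read };
static const struct toy_ops dead_ops = { dead_read };

/*
 * Copy the live state into *clone, then leave *old with dead operations
 * for whichever tasks still hold a descriptor to it.
 */
static void toy_detach(struct toy_file *old, struct toy_file *clone)
{
    assert(old != NULL && clone != NULL);
    *clone = *old;
    old->ops = &dead_ops;
}

/*
 * Close the clone until its count reaches 0, as the umount process would.
 * Returns how many closes were needed.
 */
static int toy_drain(struct toy_file *clone)
{
    int closes = 0;

    while (clone->count > 0) {
        clone->count--;
        closes++;
    }
    return closes;
}
```

Splitting the object this way means the tasks keep a harmless handle
they can still close, while the file system side sees its references go
away without waiting for those tasks.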
This was messy and took a while to get right, primarily due to
Linux's ill-defined locking paradigm. Lists of objects are locked,
while the objects themselves only have reference counts. I was unaware
of Tigran's implementation at the start of the project. Looking at it,
there are some things I really like, so a combination of the two might
be the best implementation.
Mark Huth
Carl-Daniel Hailfinger wrote:
>[CC:ed Mark Huth and Tigran Aivazian]
>
>Christoph Hellwig wrote:
>
>
>> 4.10 Force unmount (2) 2 Experimental Availability Core
>> 4.10 Description:
>>
>> CGL shall support forced unmounting of a filesystem.
>> * The unmount should work even if there are open files or processes
>> in the file system.
>> * Pending requests should be ended with an error return when the
>> file system is unmounted.
>>
>>
>>This is very hard to get right. What's the experimental implementation
>>you're referring to?
>>
>>
>
>IIRC, Mark Huth from MontaVista and Tigran Aivazian from Veritas both
>developed such an implementation independently of each other.
>Maybe they can offer some insight.
>
>Regards,
>Carl-Daniel
>