direct_io versus mmap()

direct_io versus mmap()

sage weil
Hi all,

I've run into trouble with the mmap() versus direct_io thing: I want them
both!  From what I gather it's not practical to actually do mmap()
correctly with a userspace file system without using the VFS page cache
(and thus not using direct_io), but I'm hoping it's possible to come up
with some sort of compromise.

Basically, I need direct_io because I'm working on a distributed file
system and need to be able to intelligently manage buffer cache
consistency across multiple clients.  Because FUSE doesn't let me force
write-thru in the VFS page cache, or to force the kernel to flush dirty
buffers, or selectively disallow caching on a per-file basis, I need to
put my buffer cache in userspace (and turn off the kernel's page cache).

At the same time, I need mmap() because I want to be able to use the file
system to execute programs and run gcc and other normalish things that
break without mmap() (even just read-only mmap()).
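
A quick way to see whether a given mount supports the read-only mapping case is to try one from user space. The following is a minimal sketch, not part of FUSE: `probe_readonly_mmap()` is a hypothetical helper that maps a file with PROT_READ and MAP_PRIVATE and reports the errno if any step is refused.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try a read-only, private mapping of the whole file.
 * Returns 0 on success, or the errno of the call that failed.
 * probe_readonly_mmap() is a hypothetical helper, not a FUSE function.
 */
static int probe_readonly_mmap(const char *path)
{
	struct stat st;
	int err = 0;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	if (fstat(fd, &st) < 0) {
		err = errno;
	} else if (st.st_size == 0) {
		err = EINVAL;	/* mmap() rejects zero-length mappings */
	} else {
		void *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
			       MAP_PRIVATE, fd, 0);
		if (p == MAP_FAILED)
			err = errno;
		else
			munmap(p, (size_t)st.st_size);
	}
	close(fd);
	return err;
}
```

Calling it on a file under the mount point with direct_io on and then off would show whether mmap() itself is the call being refused; which errno comes back depends on the FUSE version.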

Although there are apparently some fundamental problems with doing a
robust mmap() on a userspace fs "right," I think I only really need
read-only mmap() to get by, and would be willing to cut corners with
consistency.  Like, if mmap() used the page cache, but regular file I/O
didn't (i.e. direct_io excluded mmap()), that would probably be fine.  In
one thread someone suggested just disabling the DIRECT_IO flag check on
mmap() might work, but it didn't seem to do the trick for me (w/ 2.3.0).

Also, now that I think about it, I remember somebody mentioning that you
can somehow enable/disable direct_io on a per-file basis... is that true?
If so, is there a way to tell from the userspace side of things whether a
file is being mmap()'d or not (and thus whether we can safely enable
direct_io)?

It doesn't seem like it'd be that complicated to do some sort of
usually_direct_io type mode that still allowed mmap(), but I'm not
familiar enough with the FUSE stuff to really know.  Is it possible?
Difficult?  Or should I be approaching this from some other direction?

Thanks so much!
sage



-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO September
19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
fuse-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: direct_io versus mmap()

Miklos Szeredi
> I've run into trouble with the mmap() versus direct_io thing: I want them
> both!  From what I gather it's not practical to actually do mmap()
> correctly with a userspace file system without using the VFS page cache
> (and thus not using direct_io), but I'm hoping it's possible to come up
> with some sort of compromise.
>
> Basically, I need direct_io because I'm working on a distributed file
> system and need to be able to intelligently manage buffer cache
> consistency across multiple clients.  Because FUSE doesn't let me force
> write-thru in the VFS page cache, or to force the kernel to flush dirty
> buffers, or selectively disallow caching on a per-file basis, I need to
> put my buffer cache in userspace (and turn off the kernel's page cache).

FUSE always does write-through.  No dirty buffers ever accumulate.

> At the same time, I need mmap() because I want to be able to use the file
> system to execute programs and run gcc and other normalish things that
> break without mmap() (even just read-only mmap()).
>
> Although there are apparently some fundamental problems with doing a
> robust mmap() on a userspace fs "right," I think I only really need
> read-only mmap() to get by, and would be willing to cut corners with
> consistency.  Like, if mmap() used the page cache, but regular file I/O
> didn't (i.e. direct_io excluded mmap()), that would probably be fine.  In
> one thread someone suggested just disabling the DIRECT_IO flag check on
> mmap() might work, but it didn't seem to do the trick for me (w/ 2.3.0).
>
> Also, now that I think about it, I remember somebody mentioning that you
> can somehow enable/disable direct_io on a per-file basis... is that true?

That is the plan.  Next release will have this.

> If so, is there a way to tell from the userspace side of things whether a
> file is being mmap()'d or not (and thus whether we can safely enable
> direct_io)?
>
> It doesn't seem like it'd be that complicated to do some sort of
> usually_direct_io type mode that still allowed mmap(), but I'm not
> familiar enough with the FUSE stuff to really know.  Is it possible?
> Difficult?  Or should I be approaching this from some other direction?

I think you should.  Do you know when the cache needs to be
invalidated?  Currently you can do that by opening and closing the file
you want the cache to be purged for.
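
The open-and-close trick can be sketched from user space as follows. `kick_cache()` is a hypothetical helper name, and whether an open actually drops the cached pages for the file depends on the FUSE version and mount options, so treat this as an illustration of the suggestion rather than a guaranteed purge.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` and close it again straight away.
 * Returns 0 on success or the errno reported by open().
 * kick_cache() is a hypothetical helper, not part of FUSE.
 */
static int kick_cache(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}
```

It is probably safest to call this from a process other than the file system daemon's own request loop, since a single-threaded daemon opening a file on its own mount would end up waiting on itself.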

Miklos



Re: direct_io versus mmap()

sage weil
On Thu, 28 Jul 2005, Miklos Szeredi wrote:
> FUSE always does write-through.  No dirty buffers ever accumulate.

Ok, that avoids half the problem...

>> Also, now that I think about it, I remember somebody mentioning that you
>> can somehow enable/disable direct_io on a per-file basis... is that true?
>
> That is the plan.  Next release will have this.

Okay.  Although in order for this to fully solve my problem there'd need
to be a way to tell if a given file is being mmap()'d, and to selectively
disable it.  Doesn't sound very elegant..

> I think you should.  Do you know when the cache needs to be
> invalidated?  Currently you can do that by opening and closing the file
> you want the cache to be purged for.

Are you suggesting that the FUSE user process open and close the file to
kick the kernel?  And that would really flush pages even though another
process has the file open the whole time?

It's not so much that I need to periodically purge all pages, it's that I
need to force all reads to be synchronous for some indefinite period.
When processes on different nodes have a file open for both reading and
writing, all reads and writes have to go to the server to get correct
behavior.  I suppose after every read operation completes I could kick
the kernel into purging pages...?

sage






Re: direct_io versus mmap()

Miklos Szeredi
> >> Also, now that I think about it, I remember somebody mentioning that you
> >> can somehow enable/disable direct_io on a per-file basis... is that true?
> >
> > That is the plan.  Next release will have this.
>
> Okay.  Although in order for this to fully solve my problem there'd need
> to be a way to tell if a given file is being mmap()'d, and to selectively
> disable it.  Doesn't sound very elegant..

No, it doesn't.

> > I think you should.  Do you know when the cache needs to be
> > invalidated?  Currently you can do that by opening and closing the file
> > you want the cache to be purged for.
>
> Are you suggesting that the FUSE user process open and close the file to
> kick the kernel?  And that would really flush pages even though another
> process has the file open the whole time?
>
> It's not so much that I need to periodically purge all pages, it's that I
> need to force all reads to be synchronous for some indefinite period.
> When processes on different nodes have a file open for both reading and
> writing, all reads and writes have to go to the server to get correct
> behavior.  I suppose after every read operation completes I could kick
> the kernel into purging pages...?

No, what I meant was that you could have a file change notification from
the server to all clients having the file open for reading.  Then
these clients could do a cache flush.  Wouldn't that work?

Miklos



Re: direct_io versus mmap()

sage weil
On Thu, 28 Jul 2005, Miklos Szeredi wrote:
> No, what I meant was that you could have a file change notification from
> the server to all clients having the file open for reading.  Then
> these clients could do a cache flush.  Wouldn't that work?

Almost, a 'file change' notification won't work because of the delay.
During that period new data may have been written but the reader may still
be using cached data from the page cache.  To get real consistency, caches
need to be purged and caching disabled _before_ writing starts, and then
caching needs to stay disabled (with reads and writes synchronous) until
there's no longer a mix of readers/writers.

Even if my FUSE module could say "purge pages, now!", it would need to do
that after every read in order to effectively disable caching.  direct_io
is a more graceful way to accomplish that, but then I lose mmap().  I'm
willing to fudge the consistency to make mmap() work, since in practice
modification of files that are being executed doesn't really happen.  But
I want proper consistency the rest of the time.

So actually, if the per-file direct_io in the next FUSE version will let
you turn on/off direct_io for open files at will (i.e. at any random point
after the file is already open, via some callback mechanism, upon
revocation of caching capability by the server), then I think that would
solve my problem--that's exactly what the userspace buffer cache is
currently doing.  mmap() would work normally unless for some reason
another node tried to write to the file and I have to enable direct_io on
the file... which shouldn't happen under normal workloads.

Is that how the per-file direct_io thing is going to work?  Via a
callback of some sort?

sage




Re: direct_io versus mmap()

Miklos Szeredi
> Almost, a 'file change' notification won't work because of the delay.
> During that period new data may have been written but the reader may still
> be using cached data from the page cache.  To get real consistency, caches
> need to be purged and caching disabled _before_ writing starts, and then
> caching needs to stay disabled (with reads and writes synchronous) until
> there's no longer a mix of readers/writers.

I don't think that disabling the cache gives you any more consistency
guarantees in "time topology".  So it's a quantitative improvement
over cache flushing, rather than a qualitative one.  Am I missing
something?

> Even if my FUSE module could say "purge pages, now!", it would need to do
> that after every read in order to effectively disable caching.  direct_io
> is a more graceful way to accomplish that, but then I lose mmap().  I'm
> willing to fudge the consistency to make mmap() work, since in practice
> modification of files that are being executed doesn't really happen.  But
> I want proper consistency the rest of the time.
>
> So actually, if the per-file direct_io in the next FUSE version will let
> you turn on/off direct_io for open files at will (i.e. at any random point
> after the file is already open, via some callback mechanism, upon
> revocation of caching capability by the server), then I think that would
> solve my problem--that's exactly what the userspace buffer cache is
> currently doing.  mmap() would work normally unless for some reason
> another node tried to write to the file and I have to enable direct_io on
> the file... which shouldn't happen under normal workloads.
>
> Is that how the per-file direct_io thing is going to work?  Via a
> callback of some sort?

No.  It will be a flag returned from the OPEN request.  So you can't
change it while the file is open.
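
Since the flag is decided once per OPEN, the file system has to make its caching decision from whatever it knows at that moment. Below is a rough sketch of that decision, using the mixed reader/writer rule described earlier in the thread. `struct node_opens`, `struct open_reply` and both functions are hypothetical stand-ins for illustration, not FUSE structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node open counts for one file; not a FUSE structure. */
struct node_opens {
	unsigned readers;	/* handles open for reading on this node */
	unsigned writers;	/* handles open for writing on this node */
};

/* Stand-in for the reply to an OPEN request; only the flag matters here. */
struct open_reply {
	bool direct_io;
};

/*
 * True when some node has the file open for writing while more than one
 * node has it open at all, i.e. the mixed reader/writer case.
 */
static bool needs_sync_io(const struct node_opens *nodes, size_t n)
{
	size_t open_nodes = 0, writer_nodes = 0;
	size_t i;

	assert(nodes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (nodes[i].readers || nodes[i].writers)
			open_nodes++;
		if (nodes[i].writers)
			writer_nodes++;
	}
	return writer_nodes > 0 && open_nodes > 1;
}

/* Decide the flag once, at open time; it stays fixed for this handle. */
static struct open_reply handle_open(const struct node_opens *nodes, size_t n)
{
	struct open_reply reply = { .direct_io = needs_sync_io(nodes, n) };

	return reply;
}
```

A handle opened while only readers exist keeps caching enabled even if a writer appears on another node later, which is exactly the case a per-open flag cannot cover.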

Miklos



Re: direct_io versus mmap()

sage weil
On Fri, 29 Jul 2005, Miklos Szeredi wrote:

>> Almost, a 'file change' notification won't work because of the delay.
>> During that period new data may have been written but the reader may still
>> be using cached data from the page cache.  To get real consistency, caches
>> need to be purged and caching disabled _before_ writing starts, and then
>> caching needs to stay disabled (with reads and writes synchronous) until
>> there's no longer a mix of readers/writers.
>
> I don't think that disabling the cache gives you any more consistency
> guarantees in "time topology".  So it's a quantitative improvement
> over cache flushing, rather than a qualitative one.  Am I missing
> something?

It's easiest to see if you consider an outside communications channel
(although in reality any metadata operations are "outside" because they
don't involve the page cache).  Say client1 and client2 both have a file
open.  Client1 writes something, and then tells client2 he's done.
Client2 reads it.  If caching is enabled, client2 may read old or new
data, depending on the relative speeds of network links, how quickly 'file
change' messages are processed, etc.  The only way (well, simplest way) to
get correct behavior is to disable caching when there is a mix of readers
and writers (on different nodes).  (Or make the server wait until all
caches are invalidated before acknowledging the write, but that's
abysmally slow.)  That guarantees that a read that begins after a write
completed will return correct data, which gives you the same behavior you
expect with two processes on the same machine.  (If the read/write calls
overlap it's still ambiguous--also what you expect with POSIX.)
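
The race can be reproduced with a toy model in which the 'file change' notification is still in flight when client2 reads. Everything here (the structs, `client_read()`, `read_after_write()`) is made up for illustration and only models a single cached value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct server {
	int value;		/* authoritative copy of the data */
};

struct client {
	bool cache_enabled;	/* may reads be served from the local cache? */
	bool cache_valid;	/* does the cache currently hold a copy? */
	int cached;		/* the cached copy */
};

/* Serve from the cache when allowed and valid, else go to the server. */
static int client_read(struct client *c, const struct server *s)
{
	assert(c != NULL && s != NULL);
	if (c->cache_enabled && c->cache_valid)
		return c->cached;
	if (c->cache_enabled) {
		c->cached = s->value;
		c->cache_valid = true;
	}
	return s->value;
}

/* A delayed 'file change' notification finally reaching the client. */
static void deliver_invalidate(struct client *c)
{
	c->cache_valid = false;
}

/*
 * client2 reads, client1 writes through to the server and announces it over
 * an outside channel, then client2 reads again before the notification
 * arrives.  Returns what that second read sees.
 */
static int read_after_write(bool cache_enabled)
{
	struct server s = { .value = 1 };
	struct client c2 = { .cache_enabled = cache_enabled };
	int seen;

	client_read(&c2, &s);		/* first read, may populate the cache */
	s.value = 2;			/* client1's write reaches the server */
	seen = client_read(&c2, &s);	/* client2 reads after "I'm done" */
	deliver_invalidate(&c2);	/* notification arrives too late */
	return seen;
}
```

With caching enabled the model returns the old value 1; with caching disabled it returns the new value 2.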

>> Is that how the per-file direct_io thing is going to work?  Via a
>> callback of some sort?
>
> No.  It will be a flag returned from the OPEN request.  So you can't
> change it while the file is open.

Ok.  In that case, I think the easiest approach might be to try to tweak
FUSE's mmap() so that it will still work well enough to keep most users
happy when (global) direct_io is enabled.  Somebody mentioned that simply
taking out the DIRECT_IO check in the FUSE mmap function might do the
trick, but that didn't seem to work for me.  Do you think it will be much
more complicated than that?  Or can you point me toward the relevant
functions?

It's not ideal, but ultimately being able to correctly manage consistency
for most files (ones that aren't mmap()'d) is good enough!

Thanks so much!
sage



Re: direct_io versus mmap()

Miklos Szeredi
> >> Almost, a 'file change' notification won't work because of the delay.
> >> During that period new data may have been written but the reader may still
> >> be using cached data from the page cache.  To get real consistency, caches
> >> need to be purged and caching disabled _before_ writing starts, and then
> >> caching needs to stay disabled (with reads and writes synchronous) until
> >> there's no longer a mix of readers/writers.
> >
> > I don't think that disabling the cache gives you any more consistency
> > guarantees in "time topology".  So it's a quantitative improvement
> > over cache flushing, rather than a qualitative one.  Am I missing
> > something?
>
> It's easiest to see if you consider an outside communications channel
> (although in reality any metadata operations are "outside" because they
> don't involve the page cache).  Say client1 and client2 both have a file
> open.  Client1 writes something, and then tells client2 he's done.
> Client2 reads it.  If caching is enabled, client2 may read old or new
> data, depending on the relative speeds of network links, how quickly 'file
> change' messages are processed, etc.  The only way (well, simplest way) to
> get correct behavior is to disable caching when there is a mix of readers
> and writers (on different nodes).  (Or make the server wait until all
> caches are invalidated before acknowledging the write, but that's
> abysmally slow.)  That guarantees that a read that begins after a write
> completed will return correct data, which gives you the same behavior you
> expect with two processes on the same machine.  (If the read/write calls
> overlap it's still ambiguous--also what you expect with POSIX.)

OK, I see the problem better now.

> >> Is that how the per-file direct_io thing is going to work?  Via a
> >> callback of some sort?
> >
> > No.  It will be a flag returned from the OPEN request.  So you can't
> > change it while the file is open.
>
> Ok.  In that case, I think the easiest approach might be to try to tweak
> FUSE's mmap() so that it will still work well enough to keep most users
> happy when (global) direct_io is enabled.  Somebody mentioned that simply
> taking out the DIRECT_IO check in the FUSE mmap function might do the
> trick, but that didn't seem to work for me.  Do you think it will be much
> more complicated than that?  Or can you point me toward the relevant
> functions?

Well, removing the check _should_ work.  I can't see why it doesn't.
There's no other difference between the two modes of operation in the
mmap path.

> It's not ideal, but ultimately being able to correctly manage consistency
> for most files (ones that aren't mmap()'d) is good enough!

It would be nice to keep the local consistency that the page cache
gives you for memory maps.  A "read-through" mode would do it nicely.
Not sure if it's worth the effort though.

Miklos


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
fuse-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: direct_io versus mmap()

sage weil
On Fri, 29 Jul 2005, Miklos Szeredi wrote:
> Well, removing the check _should_ work.  I can't see why it doesn't.
> There's no other difference between the two modes of operation in the
> mmap path.

I commented out the DIRECT_IO check in fuse_file_mmap() in kernel/file.c
(that's the check you mean, right?).  Here's what I see when I try to
execute something:

unique: 6483, opcode: LOOKUP (1), nodeid: 1, insize: 48
LOOKUP /fakesyn
    NODEID: 2
    unique: 6483, error: 0 (Success), outsize: 136
unique: 6484, opcode: OPEN (14), nodeid: 2, insize: 48
OPEN[11] flags: 0x0
    unique: 6484, error: 0 (Success), outsize: 32
unique: 6485, opcode: RELEASE (18), nodeid: 2, insize: 56
RELEASE[11] flags: 0x0
unique: 6486, opcode: OPEN (14), nodeid: 2, insize: 48
OPEN[12] flags: 0x8000
    unique: 6486, error: 0 (Success), outsize: 32
unique: 6487, opcode: READ (15), nodeid: 2, insize: 64
READ[12] 80 bytes from 0
    READ[12] 80 bytes
    unique: 6487, error: 0 (Success), outsize: 96
unique: 6488, opcode: RELEASE (18), nodeid: 2, insize: 56
RELEASE[12] flags: 0x8000
    unique: 6485, error: 0 (Success), outsize: 16
    unique: 6488, error: 0 (Success), outsize: 16

and in my shell,

# mnt/tester
-bash: mnt/tester: Bad address
# strace -f mnt/tester
strace: exec: Bad address
execve("mnt/tester", ["mnt/tester"], [/* 25 vars */]) = 0

I thought the slow return of the first close() might be the issue, but
I get basically the same thing with -s:

unique: 6484, opcode: OPEN (14), nodeid: 2, insize: 48
OPEN[11] flags: 0x0
    unique: 6484, error: 0 (Success), outsize: 32
unique: 6485, opcode: RELEASE (18), nodeid: 2, insize: 56
RELEASE[11] flags: 0x0
    unique: 6485, error: 0 (Success), outsize: 16
unique: 6486, opcode: OPEN (14), nodeid: 2, insize: 48
OPEN[12] flags: 0x8000
    unique: 6486, error: 0 (Success), outsize: 32
unique: 6487, opcode: READ (15), nodeid: 2, insize: 64
READ[12] 80 bytes from 0
    READ[12] 80 bytes
    unique: 6487, error: 0 (Success), outsize: 96
unique: 6488, opcode: RELEASE (18), nodeid: 2, insize: 56
RELEASE[12] flags: 0x8000
    unique: 6488, error: 0 (Success), outsize: 16

sage




>
>> It's not ideal, but ultimately being able to correctly manage consistency
>> for most files (ones that aren't mmap()'d) is good enough!
>
> It would be nice to keep the local consistency that the page cache
> gives you for memory maps.  A "read-through" mode would do it nicely.
> Not sure if it's worth the effort though.
>
> Miklos

Re: direct_io versus mmap()

Miklos Szeredi
> I commented out the DIRECT_IO check in fuse_file_mmap() in kernel/file.c
> (that's the check you mean, right?).  Here's what I see when I try to
> execute something:

Did you do 'rmmod fuse; modprobe fuse'?

Miklos



Re: direct_io versus mmap()

sage weil
On Fri, 29 Jul 2005, Miklos Szeredi wrote:
>> I commented out the DIRECT_IO check in fuse_file_mmap() in kernel/file.c
>> (that's the check you mean, right?).  Here's what I see when I try to
>> execute something:
>
> Did you do 'rmmod fuse; modprobe fuse'?

Yeah.  Actually, after putting in some printk's, it looks like
fuse_file_mmap() isn't being called at all when direct_io is enabled
(nothing printed).  With direct_io off, I get 1 2 3 (as expected).



static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct inode *inode = file->f_dentry->d_inode;
	struct fuse_conn *fc = get_fuse_conn(inode);

	printk("mmap 1\n");
	/* DIRECT_IO check disabled for this experiment */
	//if (fc->flags & FUSE_DIRECT_IO)
	//	return -ENODEV;
	printk("mmap 2\n");
	if ((vma->vm_flags & VM_SHARED)) {
		if ((vma->vm_flags & VM_WRITE)) {
			/* log before returning; a printk after the return never runs */
			printk("flags & VM_WRITE\n");
			return -ENODEV;
		} else
			vma->vm_flags &= ~VM_MAYWRITE;
	}
	printk("mmap 3\n");
	return generic_file_mmap(file, vma);
}
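
The flag handling in that function can be exercised in user space with a small model. The `SK_VM_*` constants are stand-ins chosen for this sketch rather than the kernel's definitions, and `check_mmap_flags()` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel's vm_flags bits; values are illustrative only. */
#define SK_VM_WRITE	0x02UL
#define SK_VM_SHARED	0x08UL
#define SK_VM_MAYWRITE	0x20UL

/*
 * Same decision as the VM_SHARED/VM_WRITE block in the function above:
 * refuse shared writable mappings, and strip the "may become writable"
 * bit from shared read-only ones.  Returns 0 or -ENODEV.
 */
static int check_mmap_flags(unsigned long *vm_flags)
{
	assert(vm_flags != NULL);
	if (*vm_flags & SK_VM_SHARED) {
		if (*vm_flags & SK_VM_WRITE)
			return -ENODEV;
		*vm_flags &= ~SK_VM_MAYWRITE;
	}
	return 0;
}
```

Private mappings pass through unchanged, including writable ones, because their writes are copy-on-write and never reach the file; that is typically the kind of mapping used when loading an executable.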



Re: direct_io versus mmap()

Miklos Szeredi
> >
> > Did you do 'rmmod fuse; modprobe fuse'?
>
> Yeah.  Actually, after putting in some printk's, it looks like
> fuse_file_mmap() isn't being called at all when direct_io is enabled
> (nothing printed).  With direct_io off, I get 1 2 3 (as expected).

And which version of FUSE is it?  In CVS this has changed, but then
the if(FUSE_DIRECT_IO) thing wouldn't be there.

I'm totally confused.

Miklos



Re: direct_io versus mmap()

sage weil
On Fri, 29 Jul 2005, Miklos Szeredi wrote:

>>> Did you do 'rmmod fuse; modprobe fuse'?
>>
>> Yeah.  Actually, after putting in some printk's, it looks like
>> fuse_file_mmap() isn't being called at all when direct_io is enabled
>> (nothing printed).  With direct_io off, I get 1 2 3 (as expected).
>
> And which version of FUSE is it?  In CVS this has changed, but then
> the if(FUSE_DIRECT_IO) thing wouldn't be there.
>
> I'm totally confused.

2.3.0.

vapre:fuse-2.3.0 12:19 PM $ grep -rn DIRECT_IO .
./kernel/fuse_i.h:102:#define FUSE_DIRECT_IO           (1 << 3)
./kernel/file.c:592:    if (fc->flags & FUSE_DIRECT_IO)
./kernel/file.c:618:    if (fc->flags & FUSE_DIRECT_IO) {
./kernel/file.c:636:    //if (fc->flags & FUSE_DIRECT_IO)
./kernel/file.c:656:    if (fc->flags & FUSE_DIRECT_IO)
./kernel/inode.c:328:   OPT_DIRECT_IO,
./kernel/inode.c:344:   {OPT_DIRECT_IO,                 "direct_io"},
./kernel/inode.c:406:           case OPT_DIRECT_IO:
./kernel/inode.c:407:                   d->flags |= FUSE_DIRECT_IO;
./kernel/inode.c:442:   if (fc->flags & FUSE_DIRECT_IO)

Maybe the problem is that fuse_file_read does the direct_io read instead
of generic_file_read (line ~592), and subsequently doesn't populate the
page cache?

Should I try pulling the latest from CVS instead?

sage




Re: direct_io versus mmap()

Miklos Szeredi
> Maybe the problem is that fuse_file_read does the direct_io read instead
> of generic_file_read (line ~592), and subsequently doesn't populate the
> page cache?

No, mmap (or rather memory access after mmap) will populate the page
cache via the address_space_operations::readpage() method.

Finally got over my extreme laziness and actually tried out this
thing.  And to my utter amazement, it really didn't work ;)

After some investigation it turns out that it fails on the
!current->mm check in fuse_get_user_pages().

So it seems it's not yet doing any memory mapping, but some sort of
tricky read, that fuse_direct_io() can't handle (so in a way you were
right).

The following patch should implement a sort of read-through behavior:
each read will refresh all pages even if they were previously cached.
Not really tested (it compiles, and didn't crash the kernel
immediately).

Is this something like what you need?

Miklos

Index: file.c
===================================================================
RCS file: /cvsroot/fuse/fuse/kernel/file.c,v
retrieving revision 1.75
diff -u -r1.75 file.c
--- file.c 12 May 2005 14:56:34 -0000 1.75
+++ file.c 30 Jul 2005 09:20:12 -0000
@@ -583,6 +583,24 @@
 	return res;
 }
 
+static void clear_pages_uptodate(struct address_space *mapping, off_t pos,
+				 size_t count)
+{
+	pgoff_t start = pos >> PAGE_CACHE_SHIFT;
+	pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT;
+	pgoff_t i;
+
+	for (i = start; i <= end; i++) {
+		struct page *page = find_get_page(mapping, i);
+		if (page) {
+			lock_page(page);
+			ClearPageUptodate(page);
+			unlock_page(page);
+			page_cache_release(page);
+		}
+	}
+}
+
 static ssize_t fuse_file_read(struct file *file, char __user *buf,
 			      size_t count, loff_t *ppos)
 {
@@ -604,8 +622,10 @@
 		return generic_file_read(file, buf, count, ppos);
 	}
 #else
-	else
+	else {
+		clear_pages_uptodate(inode->i_mapping, *ppos, count);
 		return generic_file_read(file, buf, count, ppos);
+	}
 #endif
 }
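
The page range computed by `clear_pages_uptodate()` can be checked with a user-space version of the same arithmetic. This sketch assumes 4096-byte pages (`SKETCH_PAGE_SHIFT` of 12 stands in for `PAGE_CACHE_SHIFT`), and `pages_for_read()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumes 4096-byte pages; the kernel code uses PAGE_CACHE_SHIFT instead. */
#define SKETCH_PAGE_SHIFT 12

struct page_range {
	unsigned long first;	/* index of the first page touched */
	unsigned long last;	/* index of the last page touched, inclusive */
};

/*
 * Page indices covered by a read of `count` bytes at offset `pos`.
 * count must be non-zero, otherwise pos + count - 1 would fall before pos.
 */
static struct page_range pages_for_read(unsigned long long pos, size_t count)
{
	struct page_range r;

	assert(count > 0);
	r.first = (unsigned long)(pos >> SKETCH_PAGE_SHIFT);
	r.last = (unsigned long)((pos + count - 1) >> SKETCH_PAGE_SHIFT);
	return r;
}
```

For the 80-byte read at offset 0 seen in the debug log earlier, this gives page 0 only; a 2-byte read at offset 4095 straddles pages 0 and 1.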
 


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
fuse-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: direct_io versus mmap()

sage weil
Ah, yes, I think that does the trick!  The only downside is that I'll eat
memory filling page cache pages that will never be used (except by mmap).
Not sure how to elegantly avoid that.

This solves our problem well enough for the time being, though.  Thanks so
much!

sage



On Sat, 30 Jul 2005, Miklos Szeredi wrote:

>> Maybe the problem is that fuse_file_read does the direct_io read instead
>> of generic_file_read (line ~592), and subsequently doesn't populate the
>> page cache?
>
> No, mmap (or rather memory access after mmap) will populate the page
> cache via the address_space_operations::readpage() method.
>
> Finally got over my extreme laziness and actually tried out this
> thing.  And to my utter amazement, it really didn't work ;)
>
> After some investigation, it turns out that it fails on the
> !current->mm check in fuse_get_user_pages().
>
> So it seems it's not doing any memory mapping yet, but rather some sort of
> tricky read that fuse_direct_io() can't handle (so in a way you were
> right).
>
> The following patch should implement a sort of read-through behavior:
> each read will refresh all pages even if they were previously cached.
> Not really tested (it compiles, and didn't crash the kernel
> immediately).
>
> Is this something like what you need?
>
> Miklos



Re: direct_io versus mmap()

Miklos Szeredi
> Ah, yes, I think that does the trick!  The only downside is that I'll eat
> memory filling page cache pages that will never be used (except by mmap).
> Not sure how to elegantly avoid that.

We could try to free the pages after the read.  There's a function
that does this in mm/truncate.c: invalidate_mapping_pages().
Unfortunately it's not exported to modules (nor are the pagevec_xx
functions it uses), so it would need to be reimplemented.  Not
terribly difficult though ;)

Miklos
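
For comparison only, here is a userspace sketch of asking the kernel to drop cached pages for a byte range with posix_fadvise() and POSIX_FADV_DONTNEED. This is not the in-kernel change discussed above, and whether it would help for a FUSE-backed file is an assumption that has not been tested here. drop_cached_range() is a made-up helper name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* request the posix_fadvise() declaration */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hint that cached pages for [offset, offset + len) are no longer needed.
 * A len of 0 means "through the end of the file".
 * Returns 0 on success or an errno-style error number (not -1). */
int drop_cached_range(int fd, off_t offset, off_t len)
{
    if (offset < 0 || len < 0)
        return EINVAL;  /* reject nonsensical ranges before asking the kernel */

    /* The call is advisory; dirty pages are generally not discarded by it. */
    return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}
```

A caller would typically invoke drop_cached_range(fd, pos, count) right after a read completes. Because it is only a hint, the pages may still be resident afterwards.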

