Using read_buf correctly with more than one fuse_bufs

Alberto Miranda
Hi,

I've been taking a look at the read_buf interface while developing a
RAM-based caching filesystem and I'm not sure what the correct procedure
is to return multiple buffers through the 'struct fuse_bufvec** bufp'
passed to read_buf.

In our case, the data for a file may be stored in RAM in potentially
separate regions, which might need to be returned by a single read_buf()
invocation. From what I understood from the code and the documentation,
it seems that, first of all, a 'struct fuse_bufvec' should be
malloc()-ed with enough space to contain several 'struct fuse_buf'
entries, which should be appropriately initialized to describe these
regions.

For instance, if we had two memory regions 'data0' and 'data1', would it
be ok to initialize the fuse_bufs in the following manner? (src in this
case is a struct fuse_bufvec *):

             src->buf[0].flags = (fuse_buf_flags) (~FUSE_BUF_IS_FD);
             src->buf[0].mem = data0; // pointer to internal in-RAM cache entry
             src->buf[0].size = bytes0;

             src->buf[1].flags = (fuse_buf_flags) (~FUSE_BUF_IS_FD);
             src->buf[1].mem = data1; // pointer to internal in-RAM cache entry
             src->buf[1].size = bytes1;

My confusion comes from the following description of read_buf (fuse.h):

  * The buffer must be allocated dynamically and stored at the
  * location pointed to by bufp.  If the buffer contains memory
  * regions, they too must be allocated using malloc().  The
  * allocated memory will be freed by the caller.

This seems to imply that the contents of 'data0' and 'data1' should be
memcopied to two dynamically allocated memory regions, and that the
addresses of these two new regions should be used to initialize
buf[0].mem and buf[1].mem, rather than directly using the data that is
already in RAM. Is this the case? If so, why is this extra copy needed?

Thanks in advance,

     alberto

--
Alberto Miranda, PhD
Researcher on HPC I/O
Barcelona Supercomputing Center
www   : https://www.bsc.es/research-development/research-areas/big-data/high-performance-io
email : alberto.miranda(at)bsc.es
phone : (+34) 93 405 42 81


http://bsc.es/disclaimer

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: Using read_buf correctly with more than one fuse_bufs

Antonio SJ Musumeci
My guess is convenience in the design of the API. Rather than providing a way for the user to manage the memory, the API expects to be given ownership of it (and expects it to come from malloc()).

There are a few locations in libfuse like that. readdir is another location where it'd be nice to control the underlying buffer.

One solution to the problem here could be to add a free function pointer which the filesystem provides and the library calls once it is done with the buffer.

On Tue, Apr 4, 2017 at 11:50 AM, Alberto Miranda <[hidden email]> wrote:
> [...]


Re: Using read_buf correctly with more than one fuse_bufs

Nikolaus Rath
In reply to this post by Alberto Miranda
On Apr 04 2017, Alberto Miranda <[hidden email]> wrote:

> Hi,
>
> I've been taking a look at the read_buf interface while developing a
> RAM-based caching filesystem and I'm not sure what the correct procedure
> is to return multiple buffers through the 'struct fuse_bufvec** bufp'
> passed to read_buf.
>
> In our case, the data for a file may be stored in RAM in potentially
> separate regions, which might need to be returned by a single read_buf()
> invocation. From what I understood from the code and the documentation,
> it seems that, first of all, a 'struct fuse_bufvec' should be
> malloc()-ed with enough space to contain several 'struct fuse_buf'
> entries, which should be appropriately initialized to describe these
> regions.
>
> For instance, if we had two memory regions 'data0' and 'data1', would it
> be ok to initialize the fuse_bufs in the following manner? (src in this
> case is a struct fuse_bufvec *):
>
>              src->buf[0].flags = (fuse_buf_flags) (~FUSE_BUF_IS_FD);
>              src->buf[0].mem = data0; // pointer to internal in-RAM cache entry
>              src->buf[0].size = bytes0;
>
>              src->buf[1].flags = (fuse_buf_flags) (~FUSE_BUF_IS_FD);
>              src->buf[1].mem = data1; // pointer to internal in-RAM cache entry
>              src->buf[1].size = bytes1;

Yes, this is correct so far. But as you note below...

> My confusion comes from the following description of read_buf (fuse.h):
>
>   * The buffer must be allocated dynamically and stored at the
>   * location pointed to by bufp.  If the buffer contains memory
>   * regions, they too must be allocated using malloc().  The
>   * allocated memory will be freed by the caller.

... this means that libfuse will call free(data0) and free(data1) after
your read_buf() function returns.


> This seems to imply that the contents of 'data0' and 'data1' should be
> memcopied to two dynamically allocated memory regions, and that the
> addresses of these two new regions should be used to initialize
> buf[0].mem and buf[1].mem, rather than directly using the data that is
> already in RAM. Is this the case?

Yes.
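
As a rough sketch of what that copy could look like in C: the helper
below allocates a 'struct fuse_bufvec' with room for n entries and fills
it with malloc()-ed copies of the cached regions, so that everything
handed back may safely be passed to free(). The struct and enum
definitions are stand-ins mirroring libfuse's headers so that the sketch
compiles on its own; 'struct cache_region' and bufvec_from_regions() are
hypothetical names, not part of libfuse. Note that it sets flags to 0,
which is how libfuse's own FUSE_BUFVEC_INIT macro marks a plain memory
buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Stand-ins mirroring libfuse's definitions so this sketch compiles on
 * its own. A real filesystem would include <fuse.h> instead. */
enum fuse_buf_flags {
    FUSE_BUF_IS_FD    = (1 << 1),
    FUSE_BUF_FD_SEEK  = (1 << 2),
    FUSE_BUF_FD_RETRY = (1 << 3),
};

struct fuse_buf {
    size_t size;
    enum fuse_buf_flags flags;
    void *mem;
    int fd;
    off_t pos;
};

struct fuse_bufvec {
    size_t count;
    size_t idx;
    size_t off;
    struct fuse_buf buf[1];   /* really 'count' entries */
};

/* Hypothetical description of one in-RAM cache entry. */
struct cache_region {
    const void *data;
    size_t len;
};

/* Build a malloc()-ed bufvec whose memory regions are malloc()-ed copies
 * of the given cache regions. Returns NULL if an allocation fails. */
struct fuse_bufvec *bufvec_from_regions(const struct cache_region *regions,
                                        size_t n)
{
    assert(n > 0);

    /* buf[1] already provides one entry, so add room for n - 1 more. */
    struct fuse_bufvec *bv =
        calloc(1, sizeof(*bv) + (n - 1) * sizeof(struct fuse_buf));
    if (bv == NULL)
        return NULL;
    bv->count = n;

    for (size_t i = 0; i < n; i++) {
        size_t len = regions[i].len;
        void *copy = malloc(len > 0 ? len : 1);
        if (copy == NULL) {
            /* Undo the copies made so far. */
            while (i-- > 0)
                free(bv->buf[i].mem);
            free(bv);
            return NULL;
        }
        if (len > 0)
            memcpy(copy, regions[i].data, len);

        bv->buf[i].flags = (enum fuse_buf_flags) 0;  /* memory, not an fd */
        bv->buf[i].mem = copy;
        bv->buf[i].size = len;
    }
    return bv;
}
```

A read_buf() handler would then store the returned pointer in *bufp and
leave the cache entries themselves untouched.
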

> If so, why is this extra copy needed?

In your case it is not needed. But libfuse has to take into account that
there may also be users where the buffer *has* to be dynamically
allocated and must be freed afterwards. Since the buffer is still in use
by libfuse (the caller of read_buf()) after your handler returns,
libfuse must also be the one freeing it.

In principle, it would be possible to add a flag that tells the caller
whether to free the buffer or not. But no one has done the work.


A different solution is to use the low-level API. In that case, you
explicitly call fuse_reply_buf(), which sends the buffer without copying
(and without freeing). Afterwards, control returns to your read()
handler and you can free (or not free) the buffer as you desire.
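
For a cache that keeps a file in several separate regions, one way to
prepare such a reply is to describe the requested byte range as an array
of 'struct iovec' entries pointing straight into the cache. The sketch
below does only that mapping and assumes the regions are laid out back
to back in file order; 'struct cache_region' and regions_to_iov() are
hypothetical names. In a low-level read handler the resulting array
would presumably be handed to fuse_reply_iov(req, iov, count), the
vector counterpart of fuse_reply_buf(), with the cache entries remaining
owned by the filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical description of one in-RAM cache entry. */
struct cache_region {
    void *data;
    size_t len;
};

/* Fill iov[] with the parts of regions[] that overlap the file byte
 * range [off, off + size), treating the regions as stored back to back.
 * iov[] must have room for n entries. Returns the number of entries
 * filled in. No data is copied: the entries point into the regions. */
int regions_to_iov(const struct cache_region *regions, size_t n,
                   size_t off, size_t size, struct iovec *iov)
{
    assert(iov != NULL);

    int count = 0;
    size_t start = 0;   /* file offset at which regions[i] begins */

    for (size_t i = 0; i < n && size > 0; i++) {
        size_t end = start + regions[i].len;

        if (off < end) {
            size_t skip = off - start;            /* bytes before 'off' */
            size_t take = regions[i].len - skip;  /* bytes available */
            if (take > size)
                take = size;

            iov[count].iov_base = (char *)regions[i].data + skip;
            iov[count].iov_len = take;
            count++;

            off += take;
            size -= take;
        }
        start = end;
    }
    return count;
}
```

A request that starts past the last region yields zero entries, which
corresponds to a read at or beyond end of file.
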


Hope that helps!

-Nikolaus


--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«
