Attaching a loopback device to a direct_io enabled fuse file hangs

Sheng Yang
Hi,

I'm seeing weird behavior when using a loopback device to attach a
file in fuse. Everything works fine as long as direct_io is not
enabled for fuse. As soon as direct_io is enabled, the loopback device
hangs.

The immediate cause is in the direct I/O path (fuse_direct_io()): even
the first read request from the loopback device hangs because the page
shared with userspace cannot be released, which later results in this:

[  480.068490] INFO: task loop0:18115 blocked for more than 120 seconds.
[  480.071210]       Not tainted 4.4.13+ #10
[  480.072970] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.076325] loop0           D ffff8800ba2ab918 13808 18115      2 0x00000000
[  480.079676]  ffff8800ba2ab918 ffff8800b89e8770 ffffffff8299a480 ffff88013fc95f18
[  480.083386]  ffff88013ab75200 ffff8800b89e8000 ffff8800ba2ac000 ffff88013fc95f00
[  480.086541]  7fffffffffffffff ffff8800ba2abaa0 ffffffff81a730b0 ffff8800ba2ab930
[  480.089602] Call Trace:
[  480.091154]  [<ffffffff81a730b0>] ? bit_wait+0x60/0x60
[  480.093370]  [<ffffffff81a728e7>] schedule+0x37/0x80
[  480.095621]  [<ffffffff81a76ded>] schedule_timeout+0x25d/0x360
[  480.097789]  [<ffffffff8110f53a>] ? __delayacct_blkio_start+0x1a/0x30
[  480.100015]  [<ffffffff81049615>] ? kvm_clock_get_cycles+0x25/0x30
[  480.102216]  [<ffffffff810d34a0>] ? ktime_get+0x90/0x110
[  480.104143]  [<ffffffff8110f53a>] ? __delayacct_blkio_start+0x1a/0x30
[  480.106319]  [<ffffffff81a730b0>] ? bit_wait+0x60/0x60
[  480.107979]  [<ffffffff81a71c5f>] io_schedule_timeout+0x9f/0x110
[  480.109892]  [<ffffffff81a730c6>] bit_wait_io+0x16/0x60
[  480.111605]  [<ffffffff81a72eb9>] __wait_on_bit_lock+0x49/0xa0
[  480.113469]  [<ffffffff810b99bd>] ? vprintk_emit+0x2fd/0x560
[  480.115312]  [<ffffffff81143847>] __lock_page+0xa7/0xb0
[  480.117010]  [<ffffffff8109ef30>] ? autoremove_wake_function+0x30/0x30
[  480.118974]  [<ffffffff8114eb3c>] set_page_dirty_lock+0x4c/0x50
[  480.120757]  [<ffffffff812f6e47>] fuse_release_user_pages.isra.19+0x47/0x60
[  480.122624]  [<ffffffff812f95a1>] fuse_direct_io+0x281/0x5d0
[  480.124357]  [<ffffffff812f992f>] __fuse_direct_read+0x3f/0x60
[  480.126074]  [<ffffffff812f9985>] fuse_direct_read_iter+0x35/0x40
[  480.127190]  [<ffffffff811a95fd>] vfs_iter_read+0x5d/0x90
[  480.128080]  [<ffffffff815b49c9>] lo_read_simple.isra.25+0x99/0x1c0
[  480.129057]  [<ffffffff815b5c84>] loop_queue_work+0x654/0x6e0
[  480.130006]  [<ffffffff8107c111>] ? kthread_worker_fn+0x61/0x1a0
[  480.130958]  [<ffffffff8107c19a>] ? kthread_worker_fn+0xea/0x1a0
[  480.131929]  [<ffffffff8107c133>] kthread_worker_fn+0x83/0x1a0
[  480.132875]  [<ffffffff8107c0b0>] ? __init_kthread_worker+0x60/0x60
[  480.133846]  [<ffffffff8107c03a>] kthread+0xea/0x100
[  480.134716]  [<ffffffff8107bf50>] ? kthread_create_on_node+0x240/0x240
[  480.135741]  [<ffffffff81a7884f>] ret_from_fork+0x3f/0x70
[  480.136627]  [<ffffffff8107bf50>] ? kthread_create_on_node+0x240/0x240
[  480.137619] no locks held by loop0/18115.

Of course, everything works without the loopback device, and loopback
works with a fuse file that has direct_io disabled.

I am still working on this issue and trying to get to the bottom of
it, but I'm wondering if anyone has an idea of what is happening. I
cannot think of a reason why the loopback device would have anything
to do with a page outside of its reach.

BTW, I am using a modified version of the libfuse hellofs example for
testing, with a file added to back the read/write requests. I can
provide the source code if necessary. I've seen the same behavior with
a different backend as well, so I don't think it's specific to my fuse
implementation.

Thanks in advance.

--Sheng

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. http://sdm.link/zohodev2dev
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Miklos Szeredi
On Mon, Aug 08, 2016 at 09:26:15PM -0700, Sheng Yang wrote:

> Hi,
>
> I just got a weird behavior regarding using a loopback device to
> attach a file in fuse. Everything works fine if direct_io for fuse
> wasn't enabled. And as soon as direct_io enabled, loopback device will
> hang.

Thanks for the report.

Seems like special handling of ITER_BVEC pages was missed when the *_iter
interfaces were introduced.  This patch copies what fs/direct-io.c does: only
call set_page_dirty_lock() for pages coming from ITER_IOVEC type vector.

I haven't tested it, but pretty sure this will fix the deadlock for you.  Can
you please try?
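
To make the failure mode concrete, here is a minimal userspace sketch of
the release loop; it is not kernel code. It assumes, as the loop device
case suggests, that ITER_BVEC pages may already be locked by the caller,
whereas ITER_IOVEC pages come from user memory. All model_* names are
hypothetical, and where the kernel would sleep in lock_page() the model
merely counts the pages that would block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page: only the state relevant to the hang is modelled. */
struct model_page {
	bool locked;	/* page lock already held by someone else */
	bool dirty;
	int refcount;
};

/*
 * Model of the release loop.  Instead of sleeping in lock_page() as the
 * kernel would, it counts the pages on which set_page_dirty_lock() would
 * block and keeps going so the count can be returned.
 */
static size_t model_release_pages(struct model_page *pages, size_t n,
				  bool should_dirty)
{
	size_t would_block = 0;

	assert(pages != NULL);
	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (should_dirty) {
			if (p->locked)
				would_block++;	/* kernel: waits for the lock here */
			else
				p->dirty = true;
		}
		p->refcount--;	/* stands in for put_page() */
	}
	return would_block;
}

/* Decision from the patch: dirty only on reads into user (IOVEC) memory. */
static bool model_should_dirty(bool write, bool is_user_iovec)
{
	return !write && is_user_iovec;
}
```

With should_dirty computed as in the patch, a read through a BVEC
iterator never touches the page lock, while a read into user memory
still marks its pages dirty.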

Thanks,
Miklos

---
 fs/fuse/file.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -530,13 +530,13 @@ void fuse_read_fill(struct fuse_req *req
 	req->out.args[0].size = count;
 }
 
-static void fuse_release_user_pages(struct fuse_req *req, int write)
+static void fuse_release_user_pages(struct fuse_req *req, bool should_dirty)
 {
 	unsigned i;
 
 	for (i = 0; i < req->num_pages; i++) {
 		struct page *page = req->pages[i];
-		if (write)
+		if (should_dirty)
 			set_page_dirty_lock(page);
 		put_page(page);
 	}
@@ -1320,6 +1320,7 @@ ssize_t fuse_direct_io(struct fuse_io_pr
 		       loff_t *ppos, int flags)
 {
 	int write = flags & FUSE_DIO_WRITE;
+	bool should_dirty = !write && iter->type == ITER_IOVEC;
 	int cuse = flags & FUSE_DIO_CUSE;
 	struct file *file = io->file;
 	struct inode *inode = file->f_mapping->host;
@@ -1363,7 +1364,7 @@ ssize_t fuse_direct_io(struct fuse_io_pr
 			nres = fuse_send_read(req, io, pos, nbytes, owner);
 
 		if (!io->async)
-			fuse_release_user_pages(req, !write);
+			fuse_release_user_pages(req, should_dirty);
 		if (req->out.h.error) {
 			err = req->out.h.error;
 			break;

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Ashish Samant

Hi Miklos,

On 08/10/2016 12:25 AM, Miklos Szeredi wrote:

> Thanks for the report.
>
> Seems like special handling of ITER_BVEC pages was missed when the *_iter
> interfaces were introduced.  This patch copies what fs/direct-io.c does: only
> call set_page_dirty_lock() for pages coming from ITER_IOVEC type vector.
>
> I haven't tested it, but pretty sure this will fix the deadlock for you.  Can
> you please try?

This patch looks good.

Tested and Reviewed by : Ashish Samant <[hidden email]>

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Sheng Yang
On Fri, Aug 12, 2016 at 11:00 AM, Ashish Samant
<[hidden email]> wrote:

>
> This patch looks good.
>
> Tested and Reviewed by : Ashish Samant <[hidden email]>
>

Hi Miklos,

I just figured it out as well...

And the patch works for me! Thanks!

I am not quite sure what the impact on stable releases is, though. It
seems quite a few kernels will need to be patched.

--Sheng

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Miklos Szeredi
On Fri, Aug 12, 2016 at 08:28:53PM -0700, Sheng Yang wrote:
> On Fri, Aug 12, 2016 at 11:00 AM, Ashish Samant
> <[hidden email]> wrote:

> >
> > This patch looks good.
> >
> > Tested and Reviewed by : Ashish Samant <[hidden email]>
> >
>
> Hi Miklos,
>
> I just figured it out as well...
>
> And the patch works for me! Thanks!
>
> I am not quite sure what's the impact for stable release though. Seems
> quite a few kernels need to be patched.

Yep, will Cc stable@vger...

Slightly modified patch below.  It replaces

  iter->type == ITER_IOVEC

with

  iter_is_iovec(iter)

While the former also happens to work, it's not proper to check for equality on
a bitfield.
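
A small userspace sketch of why the equality test only happens to work.
It assumes the 4.x-era layout in which iter->type carries the direction
(READ = 0, WRITE = 1) OR'ed with the iterator kind (ITER_IOVEC = 0,
ITER_KVEC = 2, ITER_BVEC = 4), and that iter_is_iovec() masks out the
kind bits. These values are quoted from memory and should be checked
against include/linux/uio.h for the kernel in question. The MODEL_* and
model_iter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local copies of the assumed constants; not taken from kernel headers. */
enum { MODEL_READ = 0, MODEL_WRITE = 1 };
enum { MODEL_ITER_IOVEC = 0, MODEL_ITER_KVEC = 2, MODEL_ITER_BVEC = 4 };

/* Hypothetical stand-in for struct iov_iter; only the type word is kept. */
struct model_iter {
	int type;	/* direction bit OR'ed with the iterator kind */
};

/* Equality test, as in the first version of the patch. */
static bool is_iovec_by_equality(const struct model_iter *i)
{
	assert(i != NULL);
	return i->type == MODEL_ITER_IOVEC;
}

/* Mask test, mirroring what iter_is_iovec() is assumed to do. */
static bool is_iovec_by_mask(const struct model_iter *i)
{
	assert(i != NULL);
	return !(i->type & (MODEL_ITER_BVEC | MODEL_ITER_KVEC));
}
```

Under these assumptions both tests agree for a read-side IOVEC iterator,
which is the only case the patch evaluates because "!write &&"
short-circuits first. For a write-side IOVEC iterator the direction bit
is set, so only the mask test still returns true.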

Please nod that I haven't overlooked something and then I'll add the Tested-by
and Reviewed-by tags as well.

Thanks,
Miklos

---
From: Miklos Szeredi <[hidden email]>
Subject: fuse: direct-io: don't dirty ITER_BVEC pages

When reading from a loop device backed by a fuse file it deadlocks on
lock_page().

This is because the page is already locked by the read() operation done on
the loop device.  In this case we don't want to either lock the page or
dirty it.

So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.

Reported-by: Sheng Yang <[hidden email]>
Fixes: aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")
Signed-off-by: Miklos Szeredi <[hidden email]>
Cc: <[hidden email]> # v4.1+
---
 fs/fuse/file.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -530,13 +530,13 @@ void fuse_read_fill(struct fuse_req *req
 	req->out.args[0].size = count;
 }
 
-static void fuse_release_user_pages(struct fuse_req *req, int write)
+static void fuse_release_user_pages(struct fuse_req *req, bool should_dirty)
 {
 	unsigned i;
 
 	for (i = 0; i < req->num_pages; i++) {
 		struct page *page = req->pages[i];
-		if (write)
+		if (should_dirty)
 			set_page_dirty_lock(page);
 		put_page(page);
 	}
@@ -1320,6 +1320,7 @@ ssize_t fuse_direct_io(struct fuse_io_pr
 		       loff_t *ppos, int flags)
 {
 	int write = flags & FUSE_DIO_WRITE;
+	bool should_dirty = !write && iter_is_iovec(iter);
 	int cuse = flags & FUSE_DIO_CUSE;
 	struct file *file = io->file;
 	struct inode *inode = file->f_mapping->host;
@@ -1363,7 +1364,7 @@ ssize_t fuse_direct_io(struct fuse_io_pr
 			nres = fuse_send_read(req, io, pos, nbytes, owner);
 
 		if (!io->async)
-			fuse_release_user_pages(req, !write);
+			fuse_release_user_pages(req, should_dirty);
 		if (req->out.h.error) {
 			err = req->out.h.error;
 			break;

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Ashish Samant


On 08/15/2016 01:45 AM, Miklos Szeredi wrote:

> Yep, will Cc stable@vger...
>
> Slightly modified patch below.  It replaces
>
>    iter->type == ITER_IOVEC
>
> with
>
>    iter_is_iovec(iter)
>
> While the former also happens to work, it's not proper to check for equality on
> a bitfield.
>
> Please nod that I haven't overlooked something and then I'll add the Tested-by
> and Reviewed-by tags as well.
Looks good!

Thanks,
Ashish

Re: Attaching a loopback device to a direct_io enabled fuse file hangs

Sheng Yang
On Mon, Aug 15, 2016 at 1:45 AM, Miklos Szeredi <[hidden email]> wrote:

> ---
> From: Miklos Szeredi <[hidden email]>
> Subject: fuse: direct-io: don't dirty ITER_BVEC pages
>
> When reading from a loop device backed by a fuse file it deadlocks on
> lock_page().
>
> This is because the page is already locked by the read() operation done on
> the loop device.  In this case we don't want to either lock the page or
> dirty it.
>
> So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.
>
> Reported-by: Sheng Yang <[hidden email]>

Tested-by and Reviewed-by Sheng Yang <[hidden email]>

Thanks!

--Sheng
