Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Eric W. Biederman
Seth Forshee <[hidden email]> writes:

> Both of these filesystems already have use cases for mounting the
> same super block from multiple user namespaces. For sysfs this
> happens when using criu for snapshotting a container, where sysfs
> is mnounted in the containers network ns but the hosts user ns.
> The cgroup filesystem shares the same super block for all mounts
> of the same hierarchy regardless of the namespace.
>
> As a result, the restriction on mounting a super block from a
> single user namespace creates regressions for existing uses of
> these filesystems. For these specific filesystems this
> restriction isn't really necessary since the backing store is
> objects in kernel memory and thus the ids assigned from inodes
> is not subject to translation relative to s_user_ns.
>
> Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
> causes sget_userns() to skip the check of s_user_ns. Set this
> flag for the sysfs and cgroup filesystems to fix the
> regressions.

So this one needs to be sget_userns(..., &init_user_ns, ...).
And not a new special case.

Apologies for not catching this earlier.

I am looking at folding all of this into the patch that introduces
sget_userns so that even bisects won't have regresssions.

We loose the ability to call mount -o remount and actually affect
these filesystems (which we don't have without s_user_ns) but we gain a
whole lot of simplicity, and we don't break the xattr and security label
on sysfs code.

Eric

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Seth Forshee
On Tue, May 17, 2016 at 05:39:33PM -0500, Eric W. Biederman wrote:

> Seth Forshee <[hidden email]> writes:
>
> > Both of these filesystems already have use cases for mounting the
> > same super block from multiple user namespaces. For sysfs this
> > happens when using criu for snapshotting a container, where sysfs
> > is mnounted in the containers network ns but the hosts user ns.
> > The cgroup filesystem shares the same super block for all mounts
> > of the same hierarchy regardless of the namespace.
> >
> > As a result, the restriction on mounting a super block from a
> > single user namespace creates regressions for existing uses of
> > these filesystems. For these specific filesystems this
> > restriction isn't really necessary since the backing store is
> > objects in kernel memory and thus the ids assigned from inodes
> > is not subject to translation relative to s_user_ns.
> >
> > Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
> > causes sget_userns() to skip the check of s_user_ns. Set this
> > flag for the sysfs and cgroup filesystems to fix the
> > regressions.
>
> So this one needs to be sget_userns(..., &init_user_ns, ...).
> And not a new special case.

This is actually what I wanted to do, but based on a previous discussion
where I had suggested doing this (for a different reason) I came away
thinking you did not want it that way. So I'm happy with that change.

But if we do that it violates some of the assumptions of the patch to
rework MNT_NODEV on your testing branch (and also those behind patch 2
in this series). Something will need to be changed there to prevent a
regression in mount behavior when a user ns tries to mount without
MNT_NODEV when the mount inherited from its parent has it set.

> Apologies for not catching this earlier.

Actually this is a more recent patch, so you possibly hadn't seen it
before.

> I am looking at folding all of this into the patch that introduces
> sget_userns so that even bisects won't have regresssions.

That's fine with me.

Thanks,
Seth


------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Eric W. Biederman
Seth Forshee <[hidden email]> writes:

> On Tue, May 17, 2016 at 05:39:33PM -0500, Eric W. Biederman wrote:
>> Seth Forshee <[hidden email]> writes:
>>
>> > Both of these filesystems already have use cases for mounting the
>> > same super block from multiple user namespaces. For sysfs this
>> > happens when using criu for snapshotting a container, where sysfs
>> > is mnounted in the containers network ns but the hosts user ns.
>> > The cgroup filesystem shares the same super block for all mounts
>> > of the same hierarchy regardless of the namespace.
>> >
>> > As a result, the restriction on mounting a super block from a
>> > single user namespace creates regressions for existing uses of
>> > these filesystems. For these specific filesystems this
>> > restriction isn't really necessary since the backing store is
>> > objects in kernel memory and thus the ids assigned from inodes
>> > is not subject to translation relative to s_user_ns.
>> >
>> > Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
>> > causes sget_userns() to skip the check of s_user_ns. Set this
>> > flag for the sysfs and cgroup filesystems to fix the
>> > regressions.
>>
>> So this one needs to be sget_userns(..., &init_user_ns, ...).
>> And not a new special case.
>
> This is actually what I wanted to do, but based on a previous discussion
> where I had suggested doing this (for a different reason) I came away
> thinking you did not want it that way. So I'm happy with that change.

Yeah.  Somedays it seems like there are a lot of pieces in play here.
The security labels on sysfs seems to be a very compelling case.

> But if we do that it violates some of the assumptions of the patch to
> rework MNT_NODEV on your testing branch (and also those behind patch 2
> in this series). Something will need to be changed there to prevent a
> regression in mount behavior when a user ns tries to mount without
> MNT_NODEV when the mount inherited from its parent has it set.

Thank you for pointing that out.  I will look into that.

I believe I know exactly what you are talking about.  Of the choices I
think it is better to a minor localized change in the fs_fully_visible
logic than it is to cause problems elsewhere.

>> Apologies for not catching this earlier.
>
> Actually this is a more recent patch, so you possibly hadn't seen it
> before.
>
>> I am looking at folding all of this into the patch that introduces
>> sget_userns so that even bisects won't have regresssions.
>
> That's fine with me.

And thank you for keeping everything as separate patches.  That is at
least helping me catch up.  Even if I don't agree that these things
should be separate come merge time.

Eric


------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Seth Forshee
On Wed, May 18, 2016 at 10:45:31AM -0500, Eric W. Biederman wrote:

> > But if we do that it violates some of the assumptions of the patch to
> > rework MNT_NODEV on your testing branch (and also those behind patch 2
> > in this series). Something will need to be changed there to prevent a
> > regression in mount behavior when a user ns tries to mount without
> > MNT_NODEV when the mount inherited from its parent has it set.
>
> Thank you for pointing that out.  I will look into that.
>
> I believe I know exactly what you are talking about.  Of the choices I
> think it is better to a minor localized change in the fs_fully_visible
> logic than it is to cause problems elsewhere.

Agreed.

> >> Apologies for not catching this earlier.
> >
> > Actually this is a more recent patch, so you possibly hadn't seen it
> > before.
> >
> >> I am looking at folding all of this into the patch that introduces
> >> sget_userns so that even bisects won't have regresssions.
> >
> > That's fine with me.
>
> And thank you for keeping everything as separate patches.  That is at
> least helping me catch up.  Even if I don't agree that these things
> should be separate come merge time.

Honestly I probably would have squashed some of them into that first
patch myself if you hadn't already applied it to your testing branch, so
that's all just luck.

Keep in mind that I also have that patch for mqueue that isn't in this
series, and I haven't yet checked to see if the 4.7 merges introduce
anything which is going to require updating these patches. I was
planning to wait and send out updates after -rc1, but if you want that
stuff sooner just let me know.

Thanks,
Seth

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Eric W. Biederman
Seth Forshee <[hidden email]> writes:

> On Wed, May 18, 2016 at 10:45:31AM -0500, Eric W. Biederman wrote:
>> > But if we do that it violates some of the assumptions of the patch to
>> > rework MNT_NODEV on your testing branch (and also those behind patch 2
>> > in this series). Something will need to be changed there to prevent a
>> > regression in mount behavior when a user ns tries to mount without
>> > MNT_NODEV when the mount inherited from its parent has it set.
>>
>> Thank you for pointing that out.  I will look into that.
>>
>> I believe I know exactly what you are talking about.  Of the choices I
>> think it is better to a minor localized change in the fs_fully_visible
>> logic than it is to cause problems elsewhere.
>
> Agreed.
>
>> >> Apologies for not catching this earlier.
>> >
>> > Actually this is a more recent patch, so you possibly hadn't seen it
>> > before.
>> >
>> >> I am looking at folding all of this into the patch that introduces
>> >> sget_userns so that even bisects won't have regresssions.
>> >
>> > That's fine with me.
>>
>> And thank you for keeping everything as separate patches.  That is at
>> least helping me catch up.  Even if I don't agree that these things
>> should be separate come merge time.
>
> Honestly I probably would have squashed some of them into that first
> patch myself if you hadn't already applied it to your testing branch, so
> that's all just luck.
>
> Keep in mind that I also have that patch for mqueue that isn't in this
> series, and I haven't yet checked to see if the 4.7 merges introduce
> anything which is going to require updating these patches. I was
> planning to wait and send out updates after -rc1, but if you want that
> stuff sooner just let me know.

As unfortunately I don't have anything going into -rc1 I am working on
this right now.

Let me finish sorting out the sget_userns and mnt nodev mess and I will
push something out and then we can compare notes.  I think I have mqueue
covered by other changes.  As it is in the set of filesystems that
should just use sget_userns.

I am sorting through the nodev corner of this now.  It should just be a
day or two.

Eric


------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
--
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel