Bypassing directories?

Bypassing directories?

Stuart Maclean
Hi from a fuse newbie.

I am trying to essentially bypass the 'directory structure' feature of
fuse, and of file systems in general.  I am not sure if it can be done.  
My file system is to be a 'write only' one, and I want any path to be
accepted, much like git simply 'associates' a string that is a path with
some file content, without actually viewing that string as a filesystem
path.

So, under my fuse point, I want any file open/write/release sequence to
succeed.  It need not matter if a file is 'overwritten', so the
following would all work, with the fuse mount point being 'foo':

$ cp someFile foo/

$ cp otherFile foo/someName

$ cp anotherFile foo/someDir/otherFile

There will be no symlinking, mknods, etc.

The file system is unreadable, so:

$ ls foo

can return empty or even fail.

As I said earlier, I am trying to follow the git model of
associating an arbitrary string S with some content C.  The fact that S
looks like a file path is a coincidence.

Currently, I am getting bogged down in the getattr function and LOOKUP???

I am on Linux 3.13, fuse 2.9.2.  My actual target system, if/when this
ever works, will be a legacy Linux 2.6.10 kernel.

Any help gratefully appreciated.

Stuart




------------------------------------------------------------------------------
_______________________________________________
fuse-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: Bypassing directories?

Joshua Juran
On Nov 4, 2015, at 2:54 AM, Stuart Maclean <[hidden email]> wrote:

> I am trying to essentially bypass the 'directory structure' feature of
> fuse, and of file systems in general.  I am not sure if it can be done.  
> My file system is to be a 'write only' one, and I want any path to be
> accepted, much like git simply 'associates' a string that is a path with
> some file content, without actually viewing that string as a filesystem
> path.

Actually, directories in Git are a hierarchical Merkle tree, like commits except for being a strict hierarchy (instead of an arbitrary DAG) so as to mirror the filesystem.

> So, under my fuse point, I want any file open/write/release sequence to
> succeed.  It need not matter if a file is 'overwritten', so the
> following would all work, with the fuse mount point being 'foo':
>
> $ cp someFile foo/
>
> $ cp otherFile foo/someName
>
> $ cp anotherFile foo/someDir/otherFile

I don't think that's going to work.  Unless I'm mistaken, FUSE is going to request a stat for each component, and you won't know in advance whether it's meant to be a directory or file.

What about replacing '/' with ':' at the user end?  Then the filesystem can be a flat directory of files, and for any lookup you can report a file existing.
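
As a rough illustration of the ':' idea, a wrapper on the user side could flatten a path before it ever reaches the mount point. This is only a sketch: `flatten` is a made-up helper name, and it assumes the original names never contain ':' themselves (otherwise two different paths could map to the same flat name).

```python
def flatten(path: str, sep: str = ":") -> str:
    """Turn a '/'-separated path into a single flat file name.

    Hypothetical helper; assumes no path component already contains `sep`.
    """
    # Drop empty components so leading, trailing and doubled slashes vanish.
    parts = [p for p in path.split("/") if p]
    return sep.join(parts)


flat_name = flatten("someDir/otherFile")   # "someDir:otherFile"
```

With that in place, `cp anotherFile foo/someDir:otherFile` only ever touches a single path component under the mount point, so the filesystem never has to decide whether `someDir` is a directory.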

Also consider whether FUSE is the right mechanism here, and if some other API might not better serve your needs.

Josh


Re: Bypassing directories?

Stuart Maclean
Hi Josh, all,

Thanks for the prompt reply.  I agree with your concerns: it's just not
going to work in FUSE, and I have no idea how else it might be done.  
Like you say, fuse is requesting a stat (via getattr) and my filesystem
has no way of replying, since my filesystem is completely write-only.

My filesystem is actually just a list of pairs of element type (P,C)
where P is a string that resembles a Unix file path, and C is associated
content.  The list can only grow, at the tail, never shrink.  I think of
the list as being like a crude FAT (DOS?): there's a FAT and a data area.  And
there is no requirement that all Ps are unique.  I could have P1, C1 and
later add P1, C2, or even P1, C1 again.
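
To make the (P, C) list concrete, here is a minimal Python model of it. The class name `AppendOnlyLog` and the record layout (two 4-byte lengths followed by P and C) are invented purely for illustration; the real backing store's format is not described in this thread. The point is only that the interface offers an append operation and nothing that reads a pair back.

```python
import io
import struct


class AppendOnlyLog:
    """Hypothetical model of the (P, C) list: pairs can be added, never read."""

    def __init__(self, sink):
        self._sink = sink   # any binary file-like object opened for writing
        self.count = 0      # number of pairs appended so far

    def append(self, path: str, content: bytes) -> None:
        p = path.encode("utf-8")
        # Assumed record layout: two big-endian 32-bit lengths, then P, then C.
        self._sink.write(struct.pack(">II", len(p), len(content)))
        self._sink.write(p)
        self._sink.write(content)
        self.count += 1


log = AppendOnlyLog(io.BytesIO())
log.append("bucket1/data", b"Y")   # P1, C1
log.append("bucket1/data", b"Z")   # P1, C2: same P, different content
log.append("bucket1/data", b"Y")   # P1, C1 again: duplicates are allowed
```

Nothing here checks whether P has been seen before, which matches the requirement that the Ps need not be unique.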

Further, I can only write a P,C pair, I cannot later read back any P nor
C.  I know it sounds bizarre, having a write-only FAT, but it's what I need.

I can live with a flat filesystem under the mount point, i.e. no
directories allowed at all. I don't really want to put the onus on the
user to switch to e.g. ':' separators as a special meaning instead of
'/'.  The users will just have to encode their own logical data separation
into the flat file names.

Come to think of it, where in the kernel is the '/' designated as the
path separator??  I imagine somewhere in the guts (VFS??) that FUSE
could not change.  If I could 'trick' the system into believing that
some other char is the separator, then my user could use 'foo/bar/baz'
while the VFS/FUSE system would consider this as flat.  I can't imagine
that working ;)

Stuart

Re: Bypassing directories?

Antonio SJ Musumeci
If I understand what you want, you certainly can do that with FUSE.

You at some point need to know the underlying data. If you don't want to
show it then simply don't reply in readdir with anything. Stat can always
reply with data based on the input path and then put the real logic in
open/write/release. If you are write once... check if it exists and then
return EPERM or the like.


Re: Bypassing directories?

Stuart Maclean
On 11/04/2015 09:55 AM, Antonio SJ Musumeci wrote:
> If I understand what you want you certainly can do that with FUSE.
>
> You at some point need to know the underlying data. If you don't want to
> show it then simply don't reply in readdir with anything. Stat can always
> reply with data based on the input path and then put the real logic in
> open/write/release. If you are write once... check if it exists and then
> return EPERM or the like.
>
I am not sure what you mean by 'know the underlying data'.  I want the
following to work, referring to my naming of P for path and C for
content, mount point is 'foo'

Write 1: P = foo/data, C = 'X'

Write 2: P = foo/bucket1/data, C = 'Y'

Write 3: P = foo/bucket2/data, C = 'Z'

Here are the problems:

write 1: getattr(/) - OK, I can at least reply that that is a dir.  In
fact, that path is the ONLY one for which I can return 0 from getattr,
all other paths I cannot return anything other than -ENOENT.

write 1: getattr(/data) - I can only reply -ENOENT since I have no way
of looking up any names P.

The ENOENT is actually OK here, an open+write on /data will work at the
fuse level.

But writes 2, 3 won't work, since if I always return -ENOENT, the getattr
sequence on longer and longer paths (/, /bucket1, /bucket1/data) will be
cut short by the first -ENOENT for /bucket1, so I never even get a call
to getattr(/bucket1/data), nor any following open nor write.
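
The cut-short walk described above can be modelled in a few lines of Python. This is a toy model of the behaviour reported in this message, not kernel or libfuse code; `walk` and `getattr_root_only` are hypothetical names, and the string "dir" stands in for a stat result with a directory mode.

```python
import errno


def getattr_root_only(path):
    """The write-only filesystem as described: only '/' is known."""
    return "dir" if path == "/" else -errno.ENOENT


def walk(path, getattr_fn):
    """Simplified component-by-component lookup.

    Returns the list of paths that getattr_fn was asked about.
    """
    asked = ["/"]
    if getattr_fn("/") != "dir":
        return asked
    current = ""
    for part in (p for p in path.split("/") if p):
        current += "/" + part
        asked.append(current)
        # Anything that is not a directory ends the walk. For the last
        # component that is harmless (the file can then be created);
        # for an intermediate component the rest is never looked at.
        if getattr_fn(current) != "dir":
            break
    return asked


write1 = walk("/data", getattr_root_only)          # ["/", "/data"]
write2 = walk("/bucket1/data", getattr_root_only)  # ["/", "/bucket1"]
```

In the second case `/bucket1/data` never appears in the list, which is the situation described: the filesystem is not even asked about the path it would be willing to accept.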

Can you explain how EPERM is used??

Regards

Stuart


Re: Bypassing directories?

Antonio SJ Musumeci
Why not just require mkdir's then? You have to know the underlying
structure at some point to manage this. Just because it's write only
doesn't mean the code can't keep track in memory of what it's already done. If
you don't want "ls" to work, then don't return anything from readdir. If you
don't want stats to work on explicitly defined paths, then keep track of
what's been written / created already and, after the first one, return that it
doesn't exist. If you want to have different types you *have* to have some
context.
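
One possible reading of this suggestion, sketched in Python rather than against the real FUSE API: the daemon remembers which directories have been created with mkdir, answers getattr from that memory, and returns nothing from readdir. The class and method names below are illustrative only, and "dir" again stands in for a directory stat result.

```python
import errno


class DirTracker:
    """Hypothetical in-memory context for a write-only filesystem."""

    def __init__(self):
        self.dirs = {"/"}          # the root always exists

    def mkdir(self, path):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.dirs:
            return -errno.ENOENT   # parent was never created
        if path in self.dirs:
            return -errno.EEXIST
        self.dirs.add(path)
        return 0

    def getattr(self, path):
        # Only explicitly created directories are reported; files are not.
        return "dir" if path in self.dirs else -errno.ENOENT

    def readdir(self, path):
        return []                  # list nothing, so 'ls' shows an empty mount


tracker = DirTracker()
before = tracker.getattr("/bucket1")   # -ENOENT: not created yet
tracker.mkdir("/bucket1")
after = tracker.getattr("/bucket1")    # "dir": lookups below it can proceed
```

The cost is that users must run `mkdir foo/bucket1` before `cp file foo/bucket1/data`, and that this state lives only as long as the daemon does.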


Re: Bypassing directories?

Stuart Maclean
On 11/04/2015 11:05 AM, Antonio SJ Musumeci wrote:

> Why not just require mkdir's then? You have to know the underlying
> structure at some point to manage this. Just because it's write only
> doesn't mean the code can't keep track in memory what it's already done. If
> you don't want "ls" to work then don't return anything from readdir. If you
> don't want stats to work on explicitly defined paths then keep track of
> what's been written / created already and after the first one return it
> doesn't exist. If you want to have different types you *have* to have some
> context.
>
>

I can keep track of paths P in memory, but not to my backing store.  And
if my fuse daemon ever goes down, as it will of course across reboots,
I'll have no way of recovering any P values, they are not readable from
the backing store, only writable to it.

I concede this is a hare-brained idea, I'll just live with no directory
structure.

Stuart



Re: Bypassing directories?

Antonio SJ Musumeci
Why not store it out of band? On the local file system? If the backing
store is truly write only (not even metadata can be queried) then fronting
it with anything which wants to be intelligent is going to be difficult.
Unless you can just act on it and interpret any of the output of the action
in a way that helps.
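
A minimal sketch of the out-of-band idea, assuming the only state worth keeping is the set of known directory paths and that a small JSON file on the local filesystem is an acceptable place for it. The function names and the file format are assumptions made for this example, not anything prescribed by FUSE.

```python
import json
import os
import tempfile


def save_dirs(dirs, state_file):
    """Persist the set of known directories next to, not in, the backing store."""
    tmp = state_file + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(dirs), f)
    os.replace(tmp, state_file)    # rename over the old file in one step


def load_dirs(state_file):
    """Reload the set at daemon start; fall back to just the root."""
    try:
        with open(state_file) as f:
            return set(json.load(f))
    except FileNotFoundError:
        return {"/"}


state = os.path.join(tempfile.mkdtemp(), "dirs.json")
dirs = load_dirs(state)        # first start: only the root is known
dirs.add("/bucket1")
save_dirs(dirs, state)
restored = load_dirs(state)    # what a restarted daemon would see
```

This keeps the backing store itself write-only while letting the daemon survive a reboot with its directory context intact.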
