IETF-SSH archive
Re: sftp rename not good.
I think this disagreement reflects a more basic difference of opinion on
the purpose of the sftp protocol, which needs to be resolved. One side
views sftp as defining a fully abstract storage mechanism, having its own
internally consistent semantics which all compliant implementations must
uphold. The constraint might be stated thus:
"Given a sequence S of sftp operations, and an sftp-observable initial
state I, the sftp-observable result state R of executing S from I must be
the same on any implementation."
In other words, one purpose of sftp is to hide the details of server host
operation in favor of a predictable storage abstraction.
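To make that constraint concrete, here's a rough sketch in Python of what checking it might look like. Everything in it -- the assumption that "sftp" behaves like a paramiko SFTPClient (listdir/open/rename/remove), and the particular definition of "observable state" -- is my own illustration, not anything the draft defines:

    def observable_state(sftp, path="."):
        """Everything a client can see through the protocol alone:
        the names in a directory and the bytes stored under each name."""
        state = {}
        for name in sorted(sftp.listdir(path)):
            f = sftp.open(path + "/" + name, "rb")
            state[name] = f.read()
            f.close()
        return state

    def run_script(sftp, script):
        """Apply a fixed sequence S of sftp operations, e.g.
        [("rename", "foo", "Bar"), ("remove", "baz")]."""
        for op, *args in script:
            getattr(sftp, op)(*args)

    # Viewpoint #1 in one line: for any two conforming servers sftp_a and
    # sftp_b started from the same observable initial state I, after running
    # the same script S the following must hold:
    #
    #     assert observable_state(sftp_a) == observable_state(sftp_b)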
The other side of the fence says no, no -- the purpose of sftp is to
provide convenient remote manipulation of a host's filesystem, in such a
way as to be as familiar as possible to users of the host OS. Thus, each
implementation should be free to choose mappings of basic sftp operations
(such as SSH_FXP_RENAME) onto the server's filesystem primitives, in a way
its authors think will be most useful and familiar to users. In
non-obvious cases, users will have no way of knowing what the mapping will
be, short of reading the software documentation (not the protocol spec),
or just trying it out.
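To make that concrete, here are two equally defensible (and entirely hypothetical) ways a Unix server might map SSH_FXP_RENAME onto local primitives, sketched in Python for brevity. Nothing in the current draft tells a client which one it will get:

    import os

    def rename_posix_style(oldpath, newpath):
        # Hand the request straight to rename(2): on POSIX systems an existing
        # newpath is silently replaced.  (Note that os.rename itself already
        # varies by platform -- on Windows it fails if newpath exists.)
        os.rename(oldpath, newpath)

    def rename_refuse_existing(oldpath, newpath):
        # Stricter mapping: refuse to clobber an existing target and return a
        # failure status to the client instead.  (Check-then-rename is racy;
        # a real server would want something atomic here.)
        if os.path.lexists(newpath):
            raise FileExistsError(newpath)
        os.rename(oldpath, newpath)

Both are plausible readings of the current text; the difference between them is exactly the kind of thing viewpoint #1 would pin down and viewpoint #2 leaves to the implementation.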
Dan O'Reilly points out that this is what many of his users expect -- the
underlying assumption here is that the "user" is first and foremost a user
of the host OS, merely employing sftp as a way to get at some files when
not directly logged into the host. While I agree this is a common
scenario, it is not the only one. Equally valid is the following: the
user employs an sftp client as his *sole* method of accessing some file
store; he has never logged into the host, nor does he know or care what OS
it's running. One day, the systems department replaces the server with a
new machine running a different OS, but also with an sftp server. Our
putative user then performs some sequence of file manipulations he's done
many times before -- but gets different results! He screams:
wasn't using a consistent abstraction supposed to protect him from this
sort of thing? Well, that's the question: is sftp supposed to afford such
protection, or not?
Despite the inherent elegance of a fully-abstract model, and its
advantages in some situations, I have to (reluctantly) say that model #2
is probably the way to go, for a number of reasons:
1) The sftp spec as it stands does not articulate or support viewpoint #1
at all. There is no requirement for fully abstract operation, and in
fact, there is recognition of the opposite principle; from section 6.2
(File Names):
"... It is understood that the lack of well-defined semantics for file
names may cause interoperability problems between clients and servers
using radically different operating systems. However, this approach
is known to work acceptably with most systems, and alternative
approaches that e.g. treat file names as sequences of structured
components are quite complicated."
2) The two models solve related but different problems. One is remote
access to different filesystem types via a usable
least-common-denominator protocol, which will necessarily have some
limitations. The other is defining a consistent, server-independent
remote filing protocol. While the second problem is valid and could
use a solution, I think in reality sftp is more geared toward solving
the first.
3) The fully-abstract requirement will severely limit and complicate
implementation. For example, the Mac OS X HFS+ filesystem is
case-preserving but not case-sensitive. In a directory containing
files "foo" and "bar", the result of renaming "foo" -> "Bar" using
naive server-side semantics is going to be *very* different from what a
user of a traditional Unix system would expect! If abstract operation
required case-sensitivity, how would you implement that on the server?
And would it make any sense to someone logging into the server and
viewing the result? (This divergence is sketched in code below, after
point 4.)
4) Given that people *will* be accessing files both via sftp and via the
host OS, it gets worse -- even if the fully-abstract requirement stated
earlier is met, there's no guarantee the user will be happy with the
server-side result. For example: NTFS allows multiple streams per
file. Sftp has no notion of that, and I imagine that all extant
Windows sftp implementations simply read stream 0. The sftp operation
sequence "create bar; open foo; read foo; write bar; delete foo" will
be sftp-observably identical to "rename foo -> bar" (atomicity and
concurrency issues aside for the moment)... but again with naive
server-side semantics, one will probably end up trashing a multi-stream
file, while the other preserves it (both sequences are sketched below).
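The divergence in point 3 is easy to see with a throwaway Python script, run once on a case-sensitive filesystem and once on a case-insensitive one. The results in the comments are what I'd expect in the common configurations; HFS+ can of course also be formatted case-sensitive:

    import os, tempfile

    d = tempfile.mkdtemp()
    for name in ("foo", "bar"):
        open(os.path.join(d, name), "w").close()

    # The rename the sftp client asked for: foo -> Bar
    os.rename(os.path.join(d, "foo"), os.path.join(d, "Bar"))

    print(sorted(os.listdir(d)))
    # Case-sensitive (typical Unix):        ['Bar', 'bar']  -- both files survive
    # Case-insensitive HFS+ (OS X default): ['Bar']         -- "bar" has been
    #                                       silently replaced by foo's contents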
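And for point 4, here are the two sequences side by side, again written against a paramiko-style SFTPClient and again purely my own sketch. Through sftp's single-data-stream view their end states are identical; on the server, only the first has any chance of preserving what sftp never sees -- extra NTFS streams, forks, and so on:

    def rename_via_server(sftp, src, dst):
        # One SSH_FXP_RENAME round trip; whatever the server's native rename
        # preserves (alternate streams, forks, ACLs) comes along for free.
        sftp.rename(src, dst)

    def rename_via_copy(sftp, src, dst):
        # "create bar; open foo; read foo; write bar; delete foo" -- only the
        # default data stream makes the trip.
        f_out = sftp.open(dst, "wb")   # create bar
        f_in = sftp.open(src, "rb")    # open foo
        data = f_in.read()             # read foo
        f_out.write(data)              # write bar
        f_in.close()
        f_out.close()
        sftp.remove(src)               # delete foo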
- Richard Silverman
slade%shore.net@localhost