[cgl_discussion] Re: [OCF]draft 0.8 of the SAF Application Interface specification

David Brower David.Brower at oracle.com
Thu Jan 9 11:05:00 PST 2003

Frederic Herrmann wrote:

>>From: David Brower <David.Brower at oracle.com>

>>>Just to add a small clarification to MiKu's observation: The spec uses 
>>>calls in the examples but it does not require the underlying OS to be 
>>>compliant. Small footprint / RT implementations of the SAF middleware may 
>>>find POSIX compliance of the OS too onerous.
>>Au contraire, it assumes POSIX-like semantics every place it uses an
>>integer file descriptor, and assumes the presence of something like
>>select or poll.
> In the SAF application spec, an opaque type (SaSelectionObject) is returned 
> to the application. This may be a file desc on a Posix system or something 
> else on other systems. There is no assumption on select or poll explicitly 
> by the spec but only an assumption that the application has a way to wait 
> on the selection object returned previously. 

In 5.1.2, where it introduces the type, it says, "The file descriptor
for select or poll".  It doesn't provide an API for doing the poll,
so it assumes that the user knows it's an FD and can do
"the right thing"; but formally, the type should be opaque, with
API-provided functions to do the wait.  I do not believe the assumption
that an app has a way to wait is valid.

Similarly, the SASystemNotifierT lets you get a handle, but you can't
do anything with it.  There are numerous HandleGet operations that
all return different types, e.g., the saSelectionObject, the
saSystemSyncronizatoinObject, the saSystemNotifierT, etc.  It is not
promised that there is a common wait point for them all (which may
be the correct non-promise).  It really only holds together if one assumes
they are FDs and one can use select/poll.
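
To illustrate why the FD assumption gives a common wait point, here is a
minimal sketch in Python (chosen only for brevity).  The three handle names
and the wait_any() helper are hypothetical stand-ins, not anything from the
SAF spec; each "handle" is simply assumed to be backed by a descriptor.

```python
import select
import socket

# Hypothetical stand-ins: each handle type is assumed to be backed by a
# descriptor (here, the read end of a socket pair).
pairs = {name: socket.socketpair() for name in ("selection", "sync", "notifier")}

def wait_any(handles, timeout):
    """Wait on all handles with a single select() call; return the ready names."""
    by_sock = {rd: name for name, (rd, _wr) in handles.items()}
    ready, _, _ = select.select(list(by_sock), [], [], timeout)
    return sorted(by_sock[s] for s in ready)

# Make only the "notifier" handle ready, then wait on all three at once.
pairs["notifier"][1].send(b"x")
ready_names = wait_any(pairs, 1.0)
print(ready_names)

for rd, wr in pairs.values():
    rd.close()
    wr.close()
```

If any one of the handles were not a descriptor, the single select() call
would no longer cover it, which is the gap described above.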

OCF is unapologetically fd based, which is fine for what it
is targeted to be.  CGL probably doesn't care either, and maybe
SAF doesn't, because it is really only going to be used on UNIX
variants and maybe Windows.  If so, it may be a good idea to
adopt that position and say so, allowing direct use of the OS
primitives and types where appropriate, e.g., having explicit
GetFD and GetWaitObject calls for the opaque handles.  It would
still be a good idea to provide generic Wait APIs for the handles,
in case the implementor uses something for which the generic OS wait
won't work (some shared memory thing, for instance).
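
A rough sketch of that shape, again in Python for brevity.  All of the
names here (FdHandle, MemoryHandle, get_fd, generic_wait) are invented for
illustration and are not from the SAF or OCF drafts; the in-memory handle
stands in for "some shared memory thing".

```python
import select
import socket
import threading

class FdHandle:
    """Hypothetical handle backed by an OS descriptor."""
    def __init__(self):
        self._rd, self._wr = socket.socketpair()
    def get_fd(self):
        return self._rd.fileno()      # direct use of the OS primitive
    def signal(self):
        self._wr.send(b"\0")
    def wait(self, timeout):
        ready, _, _ = select.select([self._rd], [], [], timeout)
        return bool(ready)
    def close(self):
        self._rd.close()
        self._wr.close()

class MemoryHandle:
    """Hypothetical handle with no descriptor behind it."""
    def __init__(self):
        self._event = threading.Event()
    def get_fd(self):
        return None                   # nothing to hand to select/poll
    def signal(self):
        self._event.set()
    def wait(self, timeout):
        return self._event.wait(timeout)

def generic_wait(handle, timeout):
    """Library-provided wait that works whatever the handle is built on."""
    return handle.wait(timeout)

fd_handle, mem_handle = FdHandle(), MemoryHandle()
fd_value = fd_handle.get_fd()
fd_handle.signal()
mem_handle.signal()
results = (generic_wait(fd_handle, 1.0), generic_wait(mem_handle, 1.0))
idle = generic_wait(MemoryHandle(), 0.01)   # never signalled, so it times out
fd_handle.close()
```

An application that knows it is on a UNIX variant can take get_fd() and
fold it into its own select loop; one that cannot falls back to
generic_wait().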

> It is true that we didn't look at many OSes such as OS/390 to make sure 
> that there is a way to map the abstract type on something reasonable but 
> someone in our group checked that this could be easily mapped on NT 
> (without using the POSIX emulation).

It probably works OK until you mix sockets with Objects, resulting
in the need to mix waits.  The "mu" answer is to wait for the different
things in different threads.  This isn't really SAF's fault, but
it is indicative of the problems in a unified "waitfor" API.
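
For what it's worth, the different-threads answer looks roughly like the
following sketch (Python for brevity).  The socket pair stands in for a
socket and the threading.Event for a non-fd "Object"; both, and the helper
names, are assumptions made for illustration only.

```python
import queue
import select
import socket
import threading

events = queue.Queue()   # the one place the application actually waits

def watch_socket(sock, name):
    """Wait on a descriptor in its own thread and report readiness."""
    ready, _, _ = select.select([sock], [], [], 5.0)
    if ready:
        events.put(name)

def watch_object(event, name):
    """Wait on a non-fd object in its own thread and report readiness."""
    if event.wait(5.0):
        events.put(name)

rd, wr = socket.socketpair()
obj = threading.Event()
threads = [
    threading.Thread(target=watch_socket, args=(rd, "socket")),
    threading.Thread(target=watch_object, args=(obj, "object")),
]
for t in threads:
    t.start()

wr.send(b"x")   # make the socket ready
obj.set()       # make the object ready

seen = sorted(events.get(timeout=5) for _ in range(2))
for t in threads:
    t.join()
rd.close()
wr.close()
print(seen)
```

The cost is a thread per kind of wait source plus a queue to merge them,
which is the price of not having a single wait primitive.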

>>As it stands, it is about as POSIX specific as the OCF drafts, with 
>>additional complication and embellishment.
> I would appreciate if you could point out the parts which are unnecessarily
> complicated so that we can give it a try at simplifying them.

Well, OCF doesn't currently include checkpoint or DLM; that is what I meant
as embellished features compared to OCF.    I don't think I ever said
"unnecessary" (that being something I don't want to decide).   The existence
of those features adds huge complexity.  Compare the thickness of the SAF
0.8 to the OCF stuff.  (Given that, the attempt to "simplify" the DLM semantics
struck me as ironic, leaving a partial DLM with no way for clients to portably 
use other features, like lock conversion.)

>>I think OCF could align with a subset
>>of the SAF stuff easily.   I can't see SAF as meeting the 'agnostic'
>>claim.
>>Using the perennial worst case example, it would probably be a pain to
>>make that API work on OS/390 without the POSIX emulation layer
>>that many don't like to use.
> I'm not familiar with OS/390 and I would appreciate if you could point out 
> the parts of the spec which you think would be hard to implement on OS/390.

I am unfortunately on the wrong side of the UNIX/390 divide to speak
with authority.  I have been mugged for what has been presented to me as
unixisms.  An example is assuming a daemon environment with IPC between
processes in a request/response pattern.  I'm told the right thing to do is
something like a cross-address-space call resulting in the calling thread
executing in the "daemon" address space with the "daemon's" access rights.
Or to be completely async somehow.  I don't always follow.  Essentially,
whatever seems "normal" to me is bizarre and anathema there somehow.  The
most telling example is that "process" has no meaning there, and the
conceptual mapping of process to address space and tasks is very

The main issue is completeness and correctness of the type system.  It
is arguably the case that the whole API library must be provided by a
single party that can implement the common things that are used by
higher level services.  The common things are largely OS handle manipulation
and wait integration.  I guess the section 11 error handling sort of
goes there too.  It would be tough for a third party to provide
one of the services using a different IPC/transport mechanism that did not
align with the types and wait scheme used by other components.  Similarly,
the error reporting centralization seems to beg for a single provider.


-dB, without portfolio to get deeply involved.
