[Openais] Does openais need to consider errors that happen while receiving an mcast message?

Li Huanghai hhli at mail.ustc.edu.cn
Sun Jul 24 22:01:11 PDT 2005


Hi,
    I am puzzled by openais's exception handling.
When a node sends a message to all nodes, it doesn't
wait for the other nodes to report whether they
handled the message correctly. That means that if
a node fails to handle the message, most commonly because
of a malloc failure, the other nodes won't notice and will
assume it succeeded. The cluster is then in an inconsistent
state, and subsequent operations will return wrong results
that the application treats as correct. This is a big problem
for high-availability software.
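
    To make the scenario concrete, here is a minimal sketch in C. It is
only a simulation under my own assumptions: struct node, deliver(),
mcast() and consistent() are hypothetical names for illustration, not
openais functions, and the malloc failure is faked with a flag.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NODE_COUNT 3

/* Hypothetical replicated state; this is not an openais structure. */
struct node {
    int id;
    int applied;     /* number of messages this node has applied */
    int fail_alloc;  /* nonzero: pretend malloc fails on this node */
};

/* Hypothetical delivery handler.
 * Returns 0 if the message was applied, -1 if allocation failed.
 * Callers are expected to pass len > 0. */
static int deliver(struct node *n, const char *msg, size_t len)
{
    char *copy = n->fail_alloc ? NULL : malloc(len);
    if (copy == NULL) {
        return -1;  /* local failure; nobody else learns about it */
    }
    memcpy(copy, msg, len);
    n->applied++;
    free(copy);
    return 0;
}

/* Simulated mcast: every node receives the message, but the sender
 * discards each node's result, as in the scenario described above. */
static void mcast(struct node *nodes, int count, const char *msg, size_t len)
{
    for (int i = 0; i < count; i++) {
        (void)deliver(&nodes[i], msg, len);
    }
}

/* Returns 1 if all nodes have applied the same number of messages. */
static int consistent(const struct node *nodes, int count)
{
    assert(count > 0);
    for (int i = 1; i < count; i++) {
        if (nodes[i].applied != nodes[0].applied) {
            return 0;
        }
    }
    return 1;
}
```

If three nodes are set up with fail_alloc set only on the second one,
a single mcast() leaves the first and third nodes with applied == 1
and the second with applied == 0, so consistent() returns 0 although
the sender saw no error.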

    How should this problem be handled? Can it be ignored? If not,
how should it be dealt with? Does it need a rollback policy to keep
all nodes in a consistent state?


Regards,

LiHuanghai


USTC.NHPCC



