[Ksummit-2013-discuss] [ATTEND] Use vprintk_emit() for userspace event communication

Thu Jul 18 15:37:30 UTC 2013

On 07/18/2013 02:14 PM, Hidehiro Kawai wrote:
> Hannes Reinecke wrote:
>> Syslog, OTOH, is well used to being flooded with tons and tons of
>> messages so it does not suffer from this problem.
>> And with the latest printk updates vprintk_emit() now has two
>> different buffers, one for the logging message and another one for
>> structured data. So it would be possible to issue a vprintk_emit()
>> with no message, just the structured data.
>> Such a message would not show up in syslog, but would be available
>> for tools accessing the structured buffer directly.
>> This approach would have the benefit that we would not have to
>> invent yet another mechanism but could use existing, defined interfaces.
>>
>> At the kernel summit I would like to discuss this approach to figure
>> out if this use-case of using vprintk_emit() with just a dictionary
>> meets with general approval or if there are alternative routes for
>> signalling events to userspace.
> 
> It's interesting.  We are trying to handle errors in user space by
> adding a hash value to structured printk output.  Since the hash value is
> generated from the message format, user space can identify the message
> easily and consistently (as long as the message doesn't change).
> This feature aims at general errors detected by the kernel, and SCSI
> device errors are also our target.
> 
> By the way, my RFC patch can be found here:
> http://thread.gmane.org/gmane.linux.kernel/1519633
> 
Hmm. Yes, I dimly remember that this was the original idea from Kay
Sievers (who presented it at KS two years back).

But for this to work you'd have to audit all kernel messages to ensure
that each message is in fact unique; otherwise you'd have two
messages with the same hash, but generated at different places.

As for your patch, I really do think it would be more sensible to add the
hash as an entry to the dictionary for vprintk_emit() instead of adding
yet another field to printk itself.
That way you wouldn't have to modify the existing callers.
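
To make the dictionary idea a bit more concrete, here is a minimal
userspace sketch of building such an entry. The key name "MSGID", the
helper names and the choice of 32-bit FNV-1a are all assumptions made for
illustration; the RFC patch may well use a different hash and key.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over the format string; picked only for illustration. */
static uint32_t fmt_hash(const char *fmt)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	while (*fmt) {
		h ^= (unsigned char)*fmt++;
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical helper: write "MSGID=<hash>" into dict.
 * Returns the entry length without the trailing NUL, or 0 if the
 * buffer is too small.
 */
static size_t build_hash_dict(const char *fmt, char *dict, size_t size)
{
	int n = snprintf(dict, size, "MSGID=%08x",
			 (unsigned int)fmt_hash(fmt));

	if (n < 0 || (size_t)n >= size)
		return 0;
	return (size_t)n;
}
```

In the kernel the resulting buffer and its length would be handed to
vprintk_emit() as the dictionary, next to the unchanged format string.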

As a sidenote, wouldn't it be sufficient to generate a hash over the
source code location, i.e. the output of __func__ (plus __LINE__, as
__func__ alone only names the function)?
You could still extract the message string if you so choose, but the
overall generation would be far easier as you wouldn't need to care
about the actual string.
So you would avoid the problem of having duplicate strings in the
printk() output.
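
A rough userspace sketch of what I mean by hashing the location rather
than the string. The helper names, the "file:func:line" key layout and the
use of FNV-1a are assumptions for illustration only, not anything taken
from the RFC patch.

```c
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over a NUL-terminated string; picked only for illustration. */
static uint32_t loc_hash_str(const char *s)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* Hypothetical helper: hash the call site, formatted as "file:func:line". */
static uint32_t loc_hash(const char *file, const char *func, int line)
{
	char key[256];

	/* Overlong locations are silently truncated by snprintf(). */
	snprintf(key, sizeof(key), "%s:%s:%d", file, func, line);
	return loc_hash_str(key);
}

/* Expands at the call site, so each caller gets its own location hash. */
#define LOCATION_HASH() loc_hash(__FILE__, __func__, __LINE__)
```

Two call sites printing the identical string would then still get
different values, as the key differs in at least the line number.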

Cheers,

Hannes