[Fuego] How do I reboot the DUT if system hang?

Tim.Bird at sony.com Tim.Bird at sony.com
Wed Feb 24 16:23:41 UTC 2021


Thanks for the confirmation that this technique worked for you.

It is helpful to get feedback on (relatively) new features, to make sure
they work outside my own lab.
 -- Tim


> -----Original Message-----
> From: Daniel Lin 林源祥 <daniel.lin at cvitek.com>
> 
> Hi Tim,
> 
> Thanks so much for your reply. It was very helpful.
> Sorry for my late reply.
> 
> I followed your instructions to set up hardware reboot and it worked!
> I use a USBRelay (a USB device that can switch the power of a USB port on and off) to power cycle the board if it hangs.
> 
> Step 1: Install Fuego with the privileged option (needed for USB access), for example:
> $ ./install.sh --priv myfuego
> 
> Step 2:
> Here are my settings in ~/fuego/fuego-ro/boards/dut.board:
> 
> BOARD_CONTROL="custom"
> 
> override-func ov_board_control_reboot() {
>     echo "Turning off power to board (in ov_board_control_reboot)"
>     sleep 3
>     # USBRelay_on.py powers the DUT off and on via a USB port (e.g. /dev/ttyUSB1).
>     # Run "chmod 666 /dev/ttyUSB1" if access to the USB port is denied.
>     cd /home
>     python3 USBRelay_on.py
>     echo "Turning on power to board (in ov_board_control_reboot)"
>     sleep 40
> }
> 
> When the DUT does not respond or has no network, Fuego starts the hardware reboot by running "python3 USBRelay_on.py".
> This keeps the test cases independent: if a specific test case hangs the DUT, the following test cases can still run.
> 
> Thanks again.
> Daniel Lin
> 
> -----Original Message-----
> From: Tim.Bird at sony.com <Tim.Bird at sony.com>
> Sent: February 3, 2021, 04:19 AM
> To: Daniel Lin 林源祥 <daniel.lin at cvitek.com>; fuego at lists.linuxfoundation.org
> Subject: RE: [Fuego] How do I reboot the DUT if system hang?
> 
> > -----Original Message-----
> > From: Fuego <fuego-bounces at lists.linuxfoundation.org> On Behalf Of Daniel Lin 林源祥
> >
> > Hi Tim & all,
> >
> > Currently I often encounter system hangs while testing in Fuego.
> > When a system hang happens, the following test cases cannot be run.
> > I read about the PDU and the PDU daemon in the Fuego PDFs, but I don't understand what they are.
> >
> > Could anyone point me to how to reboot the DUT via hardware, or the so-called PDU, if the system hangs?
> >
> 
> Hello Daniel,
> 
> Here is some information about how Fuego handles board reboot.
> 
> By default, Fuego provides the 'target_reboot' command to perform a software
> reboot of the board.  This executes "/sbin/reboot" on the board.  When Fuego
> tries to reboot a board, it first attempts a software reboot (using
> 'target_reboot'), and if the board is unresponsive after that, it performs a
> hardware reboot using the overlay function 'ov_board_control_reboot'.  This
> overlay function is located in the file:
> fuego-core/scripts/overlays/base-board.fuegoclass.
> 
> This routine can be overridden in a board file, with an override-func definition.
> (I'll show an example of that below).
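
The reboot sequence described above (try a software reboot first, fall back to a hardware reboot only if the board stays unresponsive) can be sketched as a small stand-alone shell function. This illustrates the control flow only and is not Fuego's actual code: the function names, the probe command and the retry and delay parameters are all hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; not taken from Fuego. All names are hypothetical.

# wait_for_board PROBE_CMD TRIES DELAY
# Run PROBE_CMD up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as the probe succeeds, 1 if it never does.
wait_for_board() {
    wfb_probe=$1
    wfb_tries=$2
    wfb_delay=$3
    wfb_i=0
    while [ "$wfb_i" -lt "$wfb_tries" ]; do
        if sh -c "$wfb_probe" >/dev/null 2>&1; then
            return 0
        fi
        wfb_i=$((wfb_i + 1))
        sleep "$wfb_delay"
    done
    return 1
}

# reboot_with_fallback SOFT_CMD HARD_CMD PROBE_CMD [TRIES] [DELAY]
# Prints which method brought the board back: software, hardware or failed.
reboot_with_fallback() {
    rwf_soft=$1
    rwf_hard=$2
    rwf_probe=$3
    rwf_tries=${4:-10}
    rwf_delay=${5:-5}

    # A hung board may not respond to this at all, so ignore its exit status.
    sh -c "$rwf_soft" >/dev/null 2>&1 || true
    if wait_for_board "$rwf_probe" "$rwf_tries" "$rwf_delay"; then
        echo "software"
        return 0
    fi

    # The software reboot did not bring the board back: power cycle it.
    sh -c "$rwf_hard" >/dev/null 2>&1 || true
    if wait_for_board "$rwf_probe" "$rwf_tries" "$rwf_delay"; then
        echo "hardware"
        return 0
    fi

    echo "failed"
    return 1
}
```

It could be called with any three commands that work from inside the container, for example a software reboot command, a power-cycle script such as "reboot-my-board", and a probe that checks whether the board answers. A real version would also have to confirm that the board actually went down before probing, since a board that has not yet started rebooting still answers the probe.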
> 
> The 'ov_board_control_reboot' function (by default) calls 'ftc power-cycle'
> for a board, which in turn uses information in the board file to perform the actual board power switching to reboot the board.  ftc
> currently has support for two different board control systems: 'pdudaemon' and 'ttc'.  I will describe these more below.
> 
> To answer your question, there are two main options for configuring and implementing hardware reboot for a board in Fuego:
> 
> Option 1) Override the ov_board_control_reboot function by placing an override-func in the Fuego board file.
> Option 2) Add support for your board to either pdudaemon or ttc, and then configure the board file to reference those board controller
> systems.
> 
> Unless you are already using pdudaemon or ttc to control boards in your lab, I think that it makes more sense to choose option 1.
> 
> Option 1 requires that you be able to cause a hardware reboot or to power cycle your board using command line programs or shell scripts
> that are available inside your Fuego docker container.  This means you must have some pre-existing method of rebooting your board.
> It does not matter what the method is, as long as Fuego can use it from inside the docker container, as
> user 'jenkins' (which is the user id in effect during test execution inside the docker container).
> 
> If you do not have a way to control power to your board, then you will need to come up with something.  In my own lab, I have a mixture
> of USB-based control boards, serial-based control boards, web power switches, and web relay devices that are used for different boards in
> my lab.  For each of these I have a command line program which is used to control the power for a particular board under test.
> 
> If you have a script called "reboot-my-board" that can perform a reboot of your board, then make sure that program is callable from inside
> the docker container.  I would recommend putting it into /usr/local/bin inside the container.
> 
> You also need to set the value of BOARD_CONTROL in the board file, to indicate that you have your own ov_board_control_reboot()
> function.  In this case, set the value to 'custom'.
> 
> BOARD_CONTROL="custom"
> 
> Then add an override-func for ov_board_control_reboot() in the board file for the board that you need to reboot.  If your board is named
> "mybbb" the board file would be: /fuego-ro/boards/mybbb.board.  The function declaration in the board file might look something like
> this:
> 
> override-func ov_board_control_reboot() {
>      reboot-my-board ; sleep 20
> }
> 
> Whether you need a sleep or not depends on what 'reboot-my-board' does.  In general, it is safest to add the delay for hardware reboot
> into the ov_board_control_reboot function, if it is not provided by the called program.
> 
> You do not have to use just a single command.  Basically, if you can make the reboot happen with a series of commands, then you can put
> those into your override function in your board file.
> 
> Below is an example from my own lab.
> In this particular example, I am using a controller board which is attached to the Fuego host via a serial port, and which accepts the
> commands "v" and "V" on the serial line to turn off (or on, respectively) the voltage to the board under test.
> 
> override-func ov_board_control_reboot() {
>     echo "Turning off power to board (in ov_board_control_reboot)"
>     echo v >/dev/serial/by-id/usb-wj@xnk.nu_CDB_Assist_00000042-if02
>     sleep 3
>     echo "Turning on power to board (in ov_board_control_reboot)"
>     echo V >/dev/serial/by-id/usb-wj@xnk.nu_CDB_Assist_00000042-if02
>     echo "Waiting for board to boot (in ov_board_control_reboot)"
>     sleep 40
> }
> 
> My definition includes some debug echo statements to help see what is going on.  These messages would appear in the console log if this
> function is called during test execution.
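
The serial-port example above could also be factored into a reusable helper that takes the device path and the two delays as arguments and fails early when the device is not writable (for instance when the container is not privileged or the 'jenkins' user lacks permission). This is only a sketch under those assumptions: the function name and its arguments are hypothetical, and only the "v" and "V" commands come from the example.

```shell
#!/bin/sh
# Sketch only: a parameterized variant of the serial-port example.
# The function name and its arguments are hypothetical.

# power_cycle_serial DEVICE [OFF_SECONDS] [BOOT_SECONDS]
# Sends "v" (power off) and then "V" (power on) to the controller on DEVICE.
power_cycle_serial() {
    pcs_dev=$1
    pcs_off=${2:-3}
    pcs_boot=${3:-40}

    # Fail early if the device is missing or not writable by this user.
    if [ ! -w "$pcs_dev" ]; then
        echo "cannot write to $pcs_dev" >&2
        return 1
    fi

    echo "Turning off power to board"
    echo v >"$pcs_dev" || return 1
    sleep "$pcs_off"
    echo "Turning on power to board"
    echo V >"$pcs_dev" || return 1
    echo "Waiting for board to boot"
    sleep "$pcs_boot"
}
```

Wrapped in a script that is installed inside the container, such a helper could be called from ov_board_control_reboot in the same way as 'reboot-my-board'.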
> 
> Option 2) Using an existing board control system.
> The other option is to use an existing board control system, such as pdudaemon or ttc.  In general, pdudaemon is a network-based control
> system, and ttc is a command-line based control system. The description of setting up and configuring either of these systems is outside
> the scope of this e-mail.  However, one important note is that it must be possible to effectively use these systems from inside the Fuego
> docker container.  For pdudaemon, this is usually not an issue, since it can be accessed via the network (Fuego does not use pduclient).
> For ttc, this means that any helper scripts that are used by ttc must be available inside the container (and must work from inside the
> container).  If ttc or its helpers need to communicate via a serial port or USB port, then a privileged docker container should be used to
> allow for this (see Fuego's install.sh '--priv' option).
> 
> Even though setting up pdudaemon or ttc is outside the scope of this message, here is some information about how these systems are
> integrated into Fuego:
> 
> If using pdudaemon as your board control system, then add the following to the Fuego board file for the board under test:
> BOARD_CONTROL="pdudaemon"
> PDUDAEMON_HOSTNAME="<the hostname>"
> PDUDAEMON_PORT="<the port>"
> PDUDAEMON_DELAY="<value in seconds>"
> 
> The delay value is optional.
> 
> If using ttc as your board control system, then add the following to the Fuego board file for the board under test:
> BOARD_CONTROL="ttc"
> TTC_TARGET="<the target name>"
> 
> TTC_TARGET is not needed if the ttc target name is the same as the Fuego board name.
> 
> Once the board control system is properly configured and its settings are in the Fuego board file, you should be able to perform the
> following operations on a board using the 'ftc' command:
>   ftc power-cycle -b <board-name>
>   ftc power-off -b <board-name>
>   ftc power-on -b <board-name>
> 
> Executing these command line operations is a good way to test that the configuration is correct for your selected board control system.
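
The manual check described above can be scripted so that all three operations are tried in turn. This is a minimal sketch: the function name and the FTC variable are hypothetical, and only the three 'ftc' operations and the '-b' option shown above are used.

```shell
#!/bin/sh
# Sketch only: run the three ftc power operations in turn for one board.
# FTC can be overridden (for example for a dry run); it defaults to "ftc".
FTC=${FTC:-ftc}

# check_board_control BOARD_NAME
# Stops at the first operation that fails and returns non-zero.
check_board_control() {
    cbc_board=$1
    for cbc_op in power-off power-on power-cycle; do
        if "$FTC" "$cbc_op" -b "$cbc_board"; then
            echo "ok: $cbc_op"
        else
            echo "FAILED: $cbc_op" >&2
            return 1
        fi
    done
}
```

The operations are ordered so that the board is left powered on when every step succeeds.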
> 
> Finally, there is a test in Fuego to validate that software and hardware reboot for your board are working correctly.  It is the test
> 'Functional.reboot'.  Please try it on your system after you have configured one of the above options, and let me know your results.
> 
> I hope this is helpful.  Please do not hesitate to ask if you have any additional questions.
>  -- Tim
> 
> Resources:
> pdudaemon: https://github.com/pdudaemon/pdudaemon
> ttc: https://github.com/tbird20d/ttc
> 
> Note that the Fuego docker container already has the 'ttc' command installed, but you need to configure 'ttc' to use it.


