[Fuego] Re: [PATCH] fix_Benchmark_cyclictest_with_env

Li, Xiaoming lixm.fnst at cn.fujitsu.com
Fri Nov 2 03:40:13 UTC 2018


Hi Daniel,
CC Tim

I have modified and re-submitted this patch; it now just adds a "warning".

--[PATCH v2] Benchmark.cyclictest: add a warning for rt

I am looking forward to your reply.


BR, Li

-----Original Message-----
From: Daniel Sangorrin [mailto:daniel.sangorrin at toshiba.co.jp]
Sent: October 19, 2018 14:14
To: Li, Xiaoming/李 霄鸣 <lixm.fnst at cn.fujitsu.com>; fuego at lists.linuxfoundation.org
Subject: RE: [Fuego] [PATCH] fix_Benchmark_cyclictest_with_env

Hi Li,

> -----Original Message-----
> From: Li, Xiaoming <lixm.fnst at cn.fujitsu.com>
> Sent: Thursday, October 18, 2018 10:44 AM
> To: Daniel Sangorrin <daniel.sangorrin at toshiba.co.jp>; 
> fuego at lists.linuxfoundation.org
> Subject: Re: [Fuego] [PATCH] fix_Benchmark_cyclictest_with_env
> 
> Hi Daniel
> 
> Sorry for the late reply!!
> 
> > -----Original Message-----
> > From: fuego-bounces at lists.linuxfoundation.org
> > [mailto:fuego-bounces at lists.linuxfoundation.org] On Behalf Of Daniel 
> > Sangorrin
> > Sent: Friday, October 05, 2018 4:53 PM
> > To: Li, Xiaoming/李 霄鸣 <lixm.fnst at cn.fujitsu.com>; 
> > fuego at lists.linuxfoundation.org
> > Subject: Re: [Fuego] [PATCH] fix_Benchmark_cyclictest_with_env
> ...
> > > Yes, as you mentioned, "sysctl -w kernel.sched_rt_runtime_us=-1"
> > > means no limit and it can solve the problem, but it only makes
> > > sense if this test needs the full RT scheduling bandwidth.
> > >
> > >
> > >
> > > It was our mistake: the cause of this issue actually has nothing
> > > to do with the hardware architecture.
> > >
> > > So this patch may be inappropriate. The issue is caused only by our
> > > target's environment configuration:
> > >
> > > (1) CONFIG_RT_GROUP_SCHED enabled.
> > >
> > > (2) all task cgroups have no runtime assigned.
> > >
> > >
> > >
> > > The methods we know of to solve this problem are as follows:
> > >
> > > 1) Disable the "cgroup scheduling policy":
> > >
> > >      - unset "CONFIG_RT_GROUP_SCHED" in the kernel config, or set
> > > "cgroup_disable=cpu,cpuset,memory" in U-Boot.
> > >
> > > 2) Disable the "RT scheduling policy":
> > >
> > >      - set "sched_rt_runtime_us = -1"
> > >
> > > 3) Add the ssh connection process to an RT cgroup.
> > >
> > >
> > >
> > > Regarding 1) and 2), we think neither of them makes sense (sorry
> > > for our mistake before).
> > >
> > > Regarding 3), it is a more reasonable solution.
> > >
> > > But we could not find any explicit group manipulation to check and
> > > switch the RT cgroup. Could you give us some hints?
> >
> > Yes, I think the best solution would be to run sshd in a
> > cgroup that has a real-time runtime (cpu.rt_runtime_us).
> >
> > Could you try something like this please?
> > https://www.freedesktop.org/wiki/Software/systemd/MyServiceCantGetRealtime/
> 
> Thanks for your guidance.
> I have checked this URL and tried it, but found that none of those 3
> solutions works for us, because it seems that the configuration
> options they rely on have been *removed*.
>
> We also found the following note about this issue in the systemd
> README [1] (maybe you have already seen it):
> " We recommend to turn off Real-Time group scheduling in the
>   kernel when using systemd. RT group scheduling effectively
>   makes RT scheduling unavailable for most userspace, since it
>   requires explicit assignment of RT budgets to each unit whose
>   processes making use of RT. As there's no sensible way to
>   assign these budgets automatically this cannot really be
>   fixed, and it's best to disable group scheduling hence.
>      CONFIG_RT_GROUP_SCHED=n
> "
> 
> We are not sure what to do now.
> Maybe we can use the following method to fix this issue:
> - Check whether CONFIG_RT_GROUP_SCHED is enabled before testing.
>   If yes, print some WARNING messages for testers.
>   If no, run the test normally.
> What do you think about this?

Putting a warning sounds good to me.

Maybe you can add some information about how to modify the board or the board file to solve the problem.
- If the user does not need the RT groups functionality
  - disable CONFIG_RT_GROUP_SCHED and compile the kernel again
  - sysctl -w kernel.sched_rt_runtime_us=-1
- If the user does want to use RT groups functionality
  - put the sshd daemon into a cgroup with assigned runtime

By the way, I thought that you could set up the board by overriding the ov_board_setup and ov_board_teardown functions in your board file (e.g. using override-func). However, I don't see this code being used at the moment (dead code?).
An alternative option could be to use the TARGET_SETUP_LINK and TARGET_TEARDOWN_LINK options.

Regards,
Daniel


> 
> [1] https://github.com/systemd/systemd/blob/master/README#L104
> 
> Best regards
> Li
> 
> > Thanks,
> > Daniel
> >
> > >
> > >
> > >
> > > No, we use the root user account. But it seems that this issue is
> > > not related to which user we are using.
> > >
> > > About the real cause: I am not sure, but I think the following
> > > code can tell us why.
> > >
> > > ---
> > > /*
> > >  * Do not allow realtime tasks into groups that have no runtime
> > >  * assigned.
> > >  */
> > > if (rt_bandwidth_enabled() && rt_policy(policy) &&
> > >          task_group(p)->rt_bandwidth.rt_runtime == 0 &&
> > >          !task_group_is_autogroup(task_group(p))) {
> > >          task_rq_unlock(rq, p, &rf);
> > >          return -EPERM;
> > > }
> > > ---
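The kernel check above rejects an RT policy when the task's group has an RT runtime of 0. A rough way to inspect that from userspace is sketched below. It assumes a cgroup v1 system with the cpu controller mounted at /sys/fs/cgroup/cpu; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for cgroup v1; the default paths are assumptions.

# cpu_cgroup_path CGROUP_FILE
# Prints the cgroup path of the "cpu" controller from a file in
# /proc/<pid>/cgroup format (hierarchy-ID:controller-list:path).
cpu_cgroup_path() {
    awk -F: '{
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++)
            if (ctrl[i] == "cpu") {
                # Print everything after the second colon.
                print substr($0, length($1) + length($2) + 3)
                exit
            }
    }' "$1"
}

# task_rt_runtime [CGROUP_FILE] [CPU_ROOT]
# Prints cpu.rt_runtime_us for the task's cpu cgroup; fails if unavailable.
task_rt_runtime() {
    cgroup_file=${1:-/proc/self/cgroup}
    cpu_root=${2:-/sys/fs/cgroup/cpu}
    path=$(cpu_cgroup_path "$cgroup_file")
    [ -n "$path" ] || return 1
    cat "$cpu_root$path/cpu.rt_runtime_us" 2>/dev/null
}
```

Run from an ssh session as `task_rt_runtime`, a printed value of 0 would match the rt_runtime == 0 condition in the excerpt, which is where -EPERM is returned.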
> > >
> > >
> > >
> > > Best regards
> > >
> > > Li
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: fuego-bounces at lists.linuxfoundation.org On Behalf Of Tim.Bird at sony.com
> > > Sent: August 1, 2018 6:00
> > > To: daniel.sangorrin at toshiba.co.jp; Li, Xiaoming/李 霄鸣; fuego at lists.linuxfoundation.org
> > > Subject: Re: [Fuego] [PATCH] fix_Benchmark_cyclictest_with_env
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Daniel Sangorrin
> > > >
> > > > Hello Li,
> > > >
> > > > > -----Original Message-----
> > > > > From: Li, Xiaoming
> > > > >
> > > > > Details: When running on intel-up2, cyclictest reports the error
> > > > > "Unable to change scheduling policy! either run as root or join
> > > > > realtime group".
> > > > > As systemd by default puts ssh connections into a non-RT cgroup,
> > > > > logging in over ssh and running FIFO/RR tasks won't work.
> > >
> > > This is useful information.  Thanks very much for researching this.
> > >
> > >
> > >
> > > I have a few questions.  Does Fuego use a non-root user account to 
> > > execute tests on your intel-up2 board?
> > >
> > >
> > >
> > > Do you know if boards that don't use systemd put ssh connections 
> > > into a non-RT cgroup?
> > >
> > >
> > >
> > > > > To fix this, add "sysctl -w kernel.sched_rt_runtime_us=-1". It will move
> > > > > the shell into the default cpu:/ group, which does permit priority
> > > > > scheduling.
> > >
> > > This is strange.  Is it the kernel itself that is changing the 
> > > cgroup in response to this flag being set?  (It must be - I 
> > > wouldn't expect sysctl to do anything that complicated).
> > >
> > >
> > >
> > > >
> > > > That is a bit weird. There should be another way.
> > > > The purpose of sched_rt_runtime_us is to set what percentage of cpu is
> > > > left for non-rt tasks in case an rt-task hogs the cpu. It has nothing
> > > > to do with permissions, hmm.
> > >
> > >
> > >
> > > I agree with Daniel here.  That flag can potentially upset the 
> > > realtime configuration of groups throughout the whole system.  Is 
> > > cyclictest assumed to be an invasive test? - that is, one that can 
> > > potentially disrupt the overall realtime configuration of the system?
> > > I haven't used it a lot, but I assume that by default cyclictest
> > > assumes that it has the full RT scheduling bandwidth of the system.
> > >
> > > In that regard, your fix makes sense.
> > >
> > >
> > >
> > > > # Also you are not checking if the target board is actually using systemd.
> > > > # kernel.sched_rt_runtime_us=-1 is actually a good setting for testing
> > > > (maybe not production) the rt performance with cyclictest.
> > > > What OS are you using?
> > > >
> > > > Thanks,
> > > > Daniel
> > > >
> > > > > It looks like even the console is restricted. We have only tested it
> > > > > on x86-64 and arm-64. This may only occur on x86-64; arm-64 has no
> > > > > such restriction.
> > >
> > > > >
> > >
> > > > > Signed-off-by: Li Xiaoming <lixm.fnst at cn.fujitsu.com>
> > > > > ---
> > > > >  engine/tests/Benchmark.cyclictest/fuego_test.sh | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/engine/tests/Benchmark.cyclictest/fuego_test.sh b/engine/tests/Benchmark.cyclictest/fuego_test.sh
> > > > > index 319f361..edf37c8 100755
> > > > > --- a/engine/tests/Benchmark.cyclictest/fuego_test.sh
> > > > > +++ b/engine/tests/Benchmark.cyclictest/fuego_test.sh
> > > > > @@ -16,5 +16,16 @@ function test_deploy {
> > > > >  }
> > > > >
> > > > >  function test_run {
> > > > > +    LOG_TMPFILE="$LOGDIR/sched_rt_runtime_us"
> > > > > +    BOARD_TMPFILE="$BOARD_TESTDIR/fuego.$TESTDIR/sched_rt_runtime_us"
> > > > > +    if [ $ARCHITECTURE == "x86_64" ]; then
> > > > > +        cmd "cat /proc/sys/kernel/sched_rt_runtime_us > $BOARD_TMPFILE; \
> > > > > +            sysctl -w kernel.sched_rt_runtime_us=-1";
> > > > > +        get $BOARD_TMPFILE $LOG_TMPFILE;
> > > > > +    fi
> > > > >      report "cd $BOARD_TESTDIR/fuego.$TESTDIR; ./cyclictest $BENCHMARK_CYCLICTEST_PARAMS"
> > > > > +    if [ $ARCHITECTURE == "x86_64" ]; then
> > > > > +        cmd "sysctl -w kernel.sched_rt_runtime_us=$(cat $LOG_TMPFILE)";
> > > > > +    fi
> > > > > +
> > > > >  }
> > > > > --
> > > > > 2.7.4
> > >
> > >
> > >
> > > The structure of the fix is OK, based on the desire to save the 
> > > value of sched_rt_runtime_us, set it to -1 during the test, and 
> > > then restore it.
> > >
> > >
> > >
> > > But this has nothing to do with cgroups.  My preference would be
> > > to use a mechanism that detects that a process is running in a
> > > non-RT cgroup, warns the user about that, and then switches to the
> > > cpu:/ group automatically, using some explicit group manipulation,
> > > rather than using the side effect of setting sched_rt_runtime_us
> > > to switch the cgroup.
> > >
> > >
> > >
> > > I would rather control the change in the patch (setting
> > > sched_rt_runtime_us to -1) with a test parameter (specified in the
> > > spec file or board file) that indicates whether to allow cyclictest
> > > to have unlimited RT runtime scheduling.
> > >
> > >
> > >
> > > That is, something like this:
> > >
> > > In the spec:
> > >
> > >   "unlimited_rt_runtime":"true"
> > >
> > > or board file:
> > >
> > > BENCHMARK_CYCLICTEST_UNLIMITED_RUNTIME="true"
> > >
> > >
> > >
> > > and in fuego_test.sh:
> > >
> > > if [ "$BENCHMARK_CYCLICTEST_UNLIMITED_RUNTIME" = "true" ] ; then ...
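One way the gated save, set, and restore sequence might look is sketched below. It is a plain shell sketch that runs locally on the target as root, not the actual Fuego patch; `run_with_unlimited_rt` and `RT_RUNTIME_FILE` are hypothetical names, and only BENCHMARK_CYCLICTEST_UNLIMITED_RUNTIME comes from the proposal above.

```shell
#!/bin/sh
# Hypothetical sketch: save, override and restore sched_rt_runtime_us,
# gated by the board variable proposed above.
# RT_RUNTIME_FILE is an assumption that allows testing off-target.
RT_RUNTIME_FILE=${RT_RUNTIME_FILE:-/proc/sys/kernel/sched_rt_runtime_us}

# run_with_unlimited_rt COMMAND [ARGS...]
# Runs COMMAND. If the variable is "true", the RT runtime limit is set
# to -1 for the duration of COMMAND and restored afterwards.
run_with_unlimited_rt() {
    if [ "$BENCHMARK_CYCLICTEST_UNLIMITED_RUNTIME" != "true" ]; then
        "$@"
        return $?
    fi
    saved=$(cat "$RT_RUNTIME_FILE") || return 1
    printf '%s\n' -1 > "$RT_RUNTIME_FILE" || return 1
    "$@"
    rc=$?
    # Restore the previous limit even if COMMAND failed.
    printf '%s\n' "$saved" > "$RT_RUNTIME_FILE"
    return $rc
}
```

For example: `run_with_unlimited_rt ./cyclictest $BENCHMARK_CYCLICTEST_PARAMS`. When the variable is unset or not "true", the command runs with the system's RT limits untouched.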
> > >
> > >
> > >
> > > If this is something that is needed by lots of different RT tests,
> > > we may want to define a new board variable that is independent of
> > > this specific test (Benchmark.cyclictest).
> > >
> > >
> > >
> > > Let me know what you think.
> > >
> > > -- Tim
> > >
> > >
> > >
> > > _______________________________________________
> > >
> > > Fuego mailing list
> > >
> > > Fuego at lists.linuxfoundation.org
> > > <mailto:Fuego at lists.linuxfoundation.org>
> > >
> > > https://lists.linuxfoundation.org/mailman/listinfo/fuego
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> 
> 







