Overlayfs @Plumbers

Amir Goldstein amir73il at gmail.com
Sun Aug 30 11:37:54 UTC 2020


On Sat, Aug 29, 2020 at 6:41 PM Sargun Dhillon <sargun at sargun.me> wrote:
>
> On Fri, Aug 28, 2020 at 8:59 AM Aleksa Sarai <asarai at suse.de> wrote:
> >
> > On 2020-08-28, Eric W. Biederman <ebiederm at xmission.com> wrote:
> > > Amir Goldstein <amir73il at gmail.com> writes:
> > >
> > > > Hi Guys,
> > > >
> > > > It's been nice to virtually meet with you yesterday.
> > > > Some of you wanted to follow up on overlayfs related issues.
> > > >
> > > > If you want to discuss, try to find me in one of the
> > > > https://meet.2020.linuxplumbersconf.org/hackrooms
> > > > today between 16:00-17:00 UTC
> > > > (No need to enter the room to see who's inside)
> > > >
> > > > If those times do not work for you, contact me and we can try
> > > > to schedule another time.
> > >
> > > Did this conversation wind up happening?  Do we need to reschedule?
> >
> > This conversation already happened in a Hackroom on Tuesday. I'm not
> > sure if the Hackrooms will have their recordings published, so maybe
> > Amir can post any of the takeaways we had?
> >
> > --
> > Aleksa Sarai
> > Senior Software Engineer (Containers)
> > SUSE Linux GmbH
> > <https://www.cyphar.com/>
>
> I unfortunately missed this conversation. I wanted to bring up OverlayFS, and
> ephemeral upper dirs. We use overlayfs with Docker containers, and we waste
> a lot of time on writing things back to disk.
>
> We're not so peeved about the fact that OVL does any sync operations, as that's
> what our users have been used to. The big problem is that on unmount, overlayfs
> decides syncing the upperdirs is a good idea. IIRC, this regression was
> introduced somewhere in the 4.X series.
>
> We've been carrying a patch to short-circuit this behaviour for a while now:
> https://github.com/Netflix-Skunkworks/linux/commit/edb195d9b73cc22d095078010a14a690f41ee253
>
> I know that this behaviour (and any behaviour that short-circuits O_SYNC / FUA)
> is technically "wrong", but in this case, can we make an exception? I originally
> thought about using device mapper to remove the FUA bit from all BIOs, but it
> turns out that my underlying storage *always* persists data to disk, so every
> write takes...a long time.
>
> Amir, what's your take?

It's not only FUA that is causing the slowdown.
syncfs() takes internal filesystem locks (e.g. to commit a journal transaction),
so it causes interference with other writers on the same underlying filesystem.
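
To illustrate the difference in scope, here is a minimal userspace sketch,
not taken from the overlayfs code. The helper names flush_one_file() and
flush_whole_fs() are made up for this example. fsync() only flushes the one
file behind the descriptor, while syncfs() flushes the entire filesystem that
contains it, which is why it can interfere with unrelated writers:

```c
#define _GNU_SOURCE /* glibc exposes syncfs() only with this defined */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: flush the data and metadata of a single file.
 * Returns 0 on success, -1 on error. */
int flush_one_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int ret = fsync(fd);
    close(fd);
    return ret;
}

/* Hypothetical helper: flush the whole filesystem that contains `path`.
 * Every dirty inode on that filesystem is written back, not just `path`.
 * Returns 0 on success, -1 on error. */
int flush_whole_fs(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int ret = syncfs(fd);
    close(fd);
    return ret;
}
```

Calling flush_whole_fs() on any path inside the upper dir would write back
everything on the underlying filesystem, including data that belongs to other
containers sharing it.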

As Giuseppe pointed out, a patch has already been submitted to address
this issue.

Thanks,
Amir.

