[PATCH v2] btrfs: fix rw device counting in __btrfs_free_extra_devids

Desmond Cheong Zhi Xi desmondcheongzx at gmail.com
Fri Aug 20 17:53:48 UTC 2021


On 20/8/21 6:58 pm, David Sterba wrote:
> On Fri, Aug 20, 2021 at 11:09:05AM +0800, Desmond Cheong Zhi Xi wrote:
>> On 20/8/21 1:34 am, David Sterba wrote:
>>> On Fri, Aug 20, 2021 at 01:11:58AM +0800, Desmond Cheong Zhi Xi wrote:
>>>>>>> The option #2 does not sound safe because the TGT bit is checked in
>>>>>>> several places where device list is queried for various reasons, even
>>>>>>> without a mounted filesystem.
>>>>>>>
>>>>>>> Removing the assertion makes more sense but I'm still not convinced that
>>>>>>> the this is expected/allowed state of a closed device.
>>>>>>>
>>>>>>
>>>>>> Would it be better if we cleared the REPLACE_TGT bit only when closing
>>>>>> the device where device->devid == BTRFS_DEV_REPLACE_DEVID?
>>>>>>
>>>>>> The first conditional in btrfs_close_one_device assumes that we can come
>>>>>> across such a device. If we come across it, we should properly reset it.
>>>>>>
>>>>>> If other devices has this bit set, the ASSERT will still catch it and
>>>>>> let us know something is wrong.
>>>>>
>>>>> That sounds great.
>>>>>
>>>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>>>> index 70f94b75f25a..a5afebb78ecf 100644
>>>>>> --- a/fs/btrfs/volumes.c
>>>>>> +++ b/fs/btrfs/volumes.c
>>>>>> @@ -1130,6 +1130,9 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>>>>                     fs_devices->rw_devices--;
>>>>>>             }
>>>>>>      
>>>>>> +       if (device->devid == BTRFS_DEV_REPLACE_DEVID)
>>>>>> +               clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
>>>>>> +
>>>>>>             if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>>>>>>                     fs_devices->missing_devices--;
>>>>>
>>>>> I'll do a few test rounds, thanks.
>>>>
>>>> Just following up. Did that resolve the issue or is further
>>>> investigation needed?
>>>
>>> The fix seems to work, I haven't seen the assertion fail anymore,
>>> incidentally the crash also stopped to show up on an unpatched branch.
>>>
>>
>> Sounds good, thanks for the update. If there's anything else I can help
>> with, please let me know.
> 
> So are you going to send the patch with the fix?
> 

Right, just sent. For some reason I thought it was already patched.


More information about the Linux-kernel-mentees mailing list