LinuxQuestions.org


grumbly 07-22-2019 01:08 PM

How does BRL hand off booting to hijacked os?
 
My os isn't supported (Solus), so I'm trying to debug and solve this myself, but could use some pointers from anyone knowledgeable.
I've updated my Solus kernel, but brl is still booting the old one. All my config files in /boot and /usr/lib/kernel of Solus itself indicate the newer kernel should boot. Lots of old posts discuss grub and grub config in brl, but there seems to be no grub or grub config anymore?
Thanks.

Bedrock Linux v 0.7
kernel change in Solus from 4.20.16-112 to 5.1.14-121
/boot/loader/entries/:
Solus-current-4.20.16-112.conf
Solus-current-5.1.14-121.conf

ParadigmComplex 07-22-2019 08:37 PM

Boot process on a traditional distro usually goes something like:
  1. motherboard stuff (BIOS/UEFI, POSTing, etc) which picks which harddrive/partition to boot
  2. bootloader (e.g. GRUB) which picks which kernel/initrd to boot
  3. init system

On Bedrock Linux 0.7.x Poki, it looks like:
  1. motherboard stuff (BIOS/UEFI, POSTing, etc) which picks which harddrive/partition to boot
  2. bootloader (e.g. GRUB) which picks which kernel/initrd to boot
  3. bedrock meta-init which picks which init to use for the session
  4. init system

Bedrock doesn't pick the kernel; that's your bootloader, which runs before any Bedrock code runs.

The only potentially tricky thing here is that different things see different files in different places, as explained in the basic usage documentation. The typical Bedrock workflow is to make all bootloader-related files global. Typically that means /boot, which Bedrock treats as global by default. See the [global] section of /bedrock/etc/bedrock.conf for a concrete list of configured global file paths, and use `brl which /path/to/file` to check at runtime whether Bedrock considers a file path global.
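As a rough illustration of the global-path idea, this Python sketch parses a bedrock.conf-style [global] section and reports whether a path falls under one of its share entries. The sample config text and the helper names (`shared_paths`, `is_under_shared`) are assumptions made up for this example; the real list lives in /bedrock/etc/bedrock.conf, and `brl which` remains the authoritative runtime check.

```python
import configparser
from pathlib import PurePosixPath

# Hypothetical excerpt; the real list is in /bedrock/etc/bedrock.conf.
SAMPLE_CONF = """
[global]
share = /boot, /home, /tmp
"""

def shared_paths(conf_text):
    """Return the comma-separated [global] share entries as a list."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    raw = parser.get("global", "share", fallback="")
    return [p.strip() for p in raw.split(",") if p.strip()]

def is_under_shared(path, shares):
    """True if path equals, or is inside, one of the shared directories."""
    target = PurePosixPath(path)
    return any(
        target == PurePosixPath(s) or PurePosixPath(s) in target.parents
        for s in shares
    )

shares = shared_paths(SAMPLE_CONF)
print(is_under_shared("/boot/loader/entries", shares))
print(is_under_shared("/usr/lib/kernel", shares))
```

Comparing whole path components rather than string prefixes avoids treating something like /bootstrap as if it were inside /boot.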

If your bootloader does not see/list the desired kernel, it could be because it is misconfigured or it is looking in some non-global path.

If you pick one kernel with your bootloader, but the bootloader loads another one, it sounds like the bootloader is just misconfigured.

When we discussed Bedrock/Solus earlier, I found that Solus on Bedrock was misconfiguring the bootloader on kernel updates. Whatever fix you applied for that issue might have been inadequate.

grumbly 07-24-2019 11:53 AM

As always, thanks so much for the detailed feedback and description of booting. The one thing I know is different is that there is no GRUB in UEFI-installed Solus. I have no clue what is in its place, even though there clearly is something. I'm getting pointers from the Solus community here, but as yet still no luck. https://discuss.getsol.us/d/1779-how...ter-updating/7

ParadigmComplex 07-24-2019 12:14 PM

Happy to help :)

I do know Bedrock has been reported to work on all sorts of bootloader related setups, including not only traditional BIOS but also newfangled UEFI and even non-x86 stuff. This is probably solvable if you can figure out what the bootloader setup is actually doing and ensure it's looking in either global (e.g. /boot) or cross (e.g. /bedrock/strata/solus/usr/) paths.

grumbly 07-24-2019 02:35 PM

Ah, I figured it out. If you're curious, /boot is not something Solus uses, but it had stuff in it in my case. Maybe this is something I did and I just don't remember, or maybe it's something BRL did? I read the clr-boot-manager source code to discover how it interfaced with boot loader configs, discovered that UEFI Solus uses systemd-boot, and then had to manually mount my EFI partition. By using the systemd tools directly, editing conf files, and manually copying kernel/initrd files over, I was able to successfully boot the new kernel. The only annoying thing is that my EFI partition is so tiny I had to remove my old kernel files to make room, since each one takes up about 1/3 of the space on the partition. I learned a lot through this. Thanks again for pointing me in the right direction!
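For anyone retracing the conf-file editing step: systemd-boot reads plain-text entry files, such as the ones under loader/entries/ listed in the first post, made of `key value` lines (`title`, `linux`, `initrd`, `options`). The Python sketch below parses one so you can see which kernel and initrd an entry points at. The sample entry text and its file paths are hypothetical placeholders, not the real Solus layout, and `parse_loader_entry` is a name invented for this example.

```python
def parse_loader_entry(text):
    """Parse 'key value' lines from a loader entry into a dict of lists."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # Some keys (e.g. initrd) may legitimately appear more than once.
        fields.setdefault(key, []).append(value)
    return fields

# Hypothetical entry content; the paths are placeholders.
SAMPLE_ENTRY = """\
# example entry
title Solus (example)
linux /EFI/example/kernel-5.1.14-121
initrd /EFI/example/initrd-5.1.14-121
options root=UUID=46504775-baed-4982-80a9-d766f2dd0313 rw
"""

entry = parse_loader_entry(SAMPLE_ENTRY)
print(entry["linux"][0])
print(entry["initrd"])
```

The `linux` and `initrd` paths in an entry are resolved relative to the partition the entry lives on, so the files they name have to exist on that same partition.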

ParadigmComplex 07-24-2019 03:08 PM

Very happy to hear you have it figured out.

Sadly, I know relatively little about EFI; I've been procrastinating figuring it out in favor of just continuing to use legacy BIOS stuff. Any ideas on why this just happens with Bedrock, and not traditional Solus? Any ideas on what Bedrock could do to make this "just work" going forward?

grumbly 07-24-2019 03:22 PM

Yes, I was thinking about this.
I have a lot of speculations, but what I would do now is do another install of solus and try to replicate all of this. I'd like to verify if indeed out-of-box behavior is to cause solus to be confused or if that is something I did, because I can't confidently say one way or the other.
I don't know if a virtual install would let me see this or not - will have to check this weekend.

grumbly 07-25-2019 12:22 PM

I verified that the BRL - Solus interaction is definitely causing this problem. I created a virtual install of Solus and tried several scenarios, using checkpointing for diverging steps. The issue seems to be that Solus has a /boot dir, but doesn't use it for anything. The BRL installer makes some change that confuses Solus into thinking the /boot dir is now the appropriate place for all EFI files, instead of mounting the ESP-boot flagged partition and using that one, which it does fine without BRL installed. So I'd look into what change the BRL installer is making that might make Solus think /boot is relevant at all. OR, since /boot is not being used at all, the BRL installer could effect a change to the Solus install such that Solus mounts the ESP-boot flagged partition on startup with a mountpoint of /boot, and then not worry about confusing Solus, because /boot is now the right place anyway.

ParadigmComplex 07-25-2019 12:55 PM

On typical distros, the software which mounts /etc/fstab contents first checks if any specified directory is already a mount point. If it is, it skips it; otherwise, it mounts the specified item there. Bedrock is all about managing what file is returned when a given file path is accessed. It does so largely by manipulating mount points with special properties. These two facts do not align well. When Bedrock hands control off to the specified init system, that init system's /etc/fstab mounting code may see a Bedrock-created mount point and skip various /etc/fstab mount lines. Bedrock works around this by mounting /etc/fstab itself before handing control off to the specified init system.
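The skip-if-already-mounted behavior described above can be modeled in a few lines of Python. This is only a simplified sketch of the logic, not the code of any particular init system; the fstab entries, device names, and function names are made up for illustration.

```python
def parse_fstab(text):
    """Yield (source, mountpoint, fstype) for each non-comment fstab line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3:
            yield fields[0], fields[1], fields[2]

def plan_mounts(fstab_text, existing_mountpoints):
    """Split fstab mount points into those to mount and those skipped."""
    to_mount, skipped = [], []
    for _source, mountpoint, _fstype in parse_fstab(fstab_text):
        if mountpoint in existing_mountpoints:
            skipped.append(mountpoint)  # something is already mounted here
        else:
            to_mount.append(mountpoint)
    return to_mount, skipped

# Hypothetical fstab; the devices are placeholders.
SAMPLE_FSTAB = """\
# <fs>     <mountpoint> <type> <opts>   <dump/pass>
/dev/sda1  /boot        vfat   defaults 0 2
/dev/sda2  /home        ext4   defaults 0 2
"""

# Pretend a mount already exists on /boot before the init's fstab
# handling runs, as it would if Bedrock had set one up.
to_mount, skipped = plan_mounts(SAMPLE_FSTAB, {"/", "/boot"})
print("mount:", to_mount)
print("skip:", skipped)
```

In this model the /boot line is never acted on, because the check only asks whether something is mounted there, not what is mounted there.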

Bedrock's code to mount /etc/fstab itself is limited. There's already a known issue about how it handles global LVM mounts. (Note that full disk LVM, such as is used by full disk encryption, works fine in Bedrock. This is only a problem with global directories, and the root directory is local.) It would not be surprising if the same thing is going on here with EFI.

Can you figure out how Solus mounts /boot normally? If it's /etc/fstab, can you get me the corresponding /etc/fstab contents?

Assuming the above guess is correct, you can work around this by figuring out how to mount efi manually, then configuring your init to mount it on /boot like it normally would be (over the Bedrock created mount on /boot).

I have two broad strategies for Bedrock to handle LVM, which presumably could be extended for EFI, assuming this is what's going on:
  1. Add code to Bedrock's core to handle LVM (and EFI).
    • This has the downside of making Bedrock's core larger
    • This has the upside of not being reliant on other strata so that you're free to remove them.
  2. Get Bedrock to figure out the catch-22 of using a stratum's LVM (and EFI) mounting tools before that stratum is set up.
    • This makes the system dependent on a non-bedrock stratum
    • Catch-22 resolution could get ugly.

Resolving this for LVM has been on my to-do list for a while, but I've been delaying it in favor of more pressing things. I'll probably get to it eventually. If we can confirm Solus/EFI has the same underlying cause, I can incorporate that into my LVM fix when I get around to it. If anyone wants to pursue getting LVM and EFI stuff into the bedrock stratum, that's probably something that can be delegated. If so, I recommend looking at how Bedrock handles netselect in its build system as a reference.

grumbly 07-25-2019 01:12 PM

Ah, that makes sense as a likely cause.
My fstab is pretty spartan, so unfortunately not that helpful. I'll see what I can find out from the Solus community.
Code:

# /etc/fstab: static file system information.
#
# <fs>      <mountpoint> <type> <opts>      <dump/pass>

#*/dev/ROOT  /            ext3    noatime        0 1
#*/dev/SWAP  none        swap    sw            0 0
#*/dev/fd0    /mnt/floppy  auto    noauto        0 0
none /proc proc nosuid,noexec 0 0
none /dev/shm tmpfs defaults 0 0
# /dev/nvme0n1p5 at time of installation
UUID=46504775-baed-4982-80a9-d766f2dd0313 / ext4 rw,relatime,errors=remount-ro 0 0


grumbly 07-25-2019 01:17 PM

Shoot, as I suspected it's in the clr-boot-manager source code.
From [https://github.com/kyrios123/solus-efi-guide]
Quote:

You may have noticed that /boot is not mounted when Solus is started. This is because clr-boot-manager only mounts this filesystem the time it's required and unmounts it immediately afterwards. This adds an extra level of safety. The drawback is that the few software that may have to legitimately interract with it, should be integrated with clr-boot-manager and this piece is not yet in place. See fwupd or rEFInd.
I like your solution #1.

Manual or fstab mounting of the correct ESP partition (there can be multiple) is easy, so a manual workaround is not an issue at the moment.

grumbly 07-25-2019 02:12 PM

The relevant mount_boot() fn is here [https://github.com/clearlinux/clr-bo...ootman.c#L397] and clearly enough commented.
It first checks if /boot is mounted, and if so proceeds with the EFI files update into /boot.
If not, then it mounts it first.

cbm_is_mounted [https://github.com/clearlinux/clr-bo.../files.c#L383]
mount is provided by <sys/mount.h> ( return system_ops->mount(source, target, filesystemtype, mountflags, data) )
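A simplified Python model of the behavior described in this post, for readers who don't want to dig through the C source. It is not a translation of clr-boot-manager's code: the function signature, the return strings, and the device name are invented for the example, and the mount check and mount call are passed in as plain callables.

```python
def mount_boot(is_mounted, mount, boot_dir="/boot", esp_device="/dev/EXAMPLE-ESP"):
    """Mount the ESP on boot_dir only if nothing is mounted there yet."""
    if is_mounted(boot_dir):
        # Something is already mounted: proceed and write EFI files there.
        return "reused existing mount"
    mount(esp_device, boot_dir)
    return "mounted ESP"

calls = []

def record_mount(source, target):
    """Stand-in for a real mount call; just records its arguments."""
    calls.append((source, target))

# Plain Solus: /boot is not a mount point, so the ESP gets mounted.
print(mount_boot(lambda d: False, record_mount))

# With a pre-existing mount on /boot: the ESP is never mounted.
print(mount_boot(lambda d: True, record_mount))
print(calls)
```

The second call is the problem case from this thread: the check passes because a mount exists on /boot, even though that mount is not the ESP.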

ParadigmComplex 07-25-2019 03:34 PM

Nice research :)

I'm guessing this

Quote:

You may have noticed that /boot is not mounted when Solus is started. This is because clr-boot-manager only mounts this filesystem the time it's required and unmounts it immediately afterwards. This adds an extra level of safety. The drawback is that the few software that may have to legitimately interract with it, should be integrated with clr-boot-manager and this piece is not yet in place. See fwupd or rEFInd.
is to protect users who run `rm -rf --no-preserve-root /`.

Given the transient mount not described in /etc, the solution here is going to be different from either LVM solution I described above.

As a local workaround for just your system, you could probably just remove /boot from /bedrock/etc/bedrock.conf's [global]/share section. However, we can't use that as a generalized solution, as /boot should be shared for other workflows.

It might be possible for Bedrock to dynamically detect this scenario and conditionally determine if /boot should be shared, but that's a very finicky proposition. While I generally prefer to fix things in Bedrock itself rather than push patches on other projects, it might be best to try to upstream a patch to clr-boot-manager to extend the pre-mount check you've found to check if an existing mount point found is actually the EFI stuff desired. Hopefully they're amenable to it; I don't see why they wouldn't be.
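One possible shape for such an extended check, sketched in Python rather than clr-boot-manager's C: instead of asking only whether something is mounted on /boot, look at what is mounted there and compare it to the expected ESP. This assumes a /proc/mounts-style listing (source, mount point, filesystem type per line) and assumes that comparing the source device is a good enough test; the device names and function names are hypothetical.

```python
def mount_on(mounts_text, mountpoint):
    """Return (source, fstype) of the last mount listed at mountpoint, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint:
            found = (fields[0], fields[2])  # later lines override earlier ones
    return found

def boot_is_esp(mounts_text, esp_source, mountpoint="/boot"):
    """True only if the mount on mountpoint comes from the expected ESP."""
    current = mount_on(mounts_text, mountpoint)
    return current is not None and current[0] == esp_source

# Hypothetical listing: /boot is a mount point, but not the ESP.
MOUNTS_WITHOUT_ESP = """\
/dev/nvme0n1p5 / ext4 rw,relatime 0 0
/dev/nvme0n1p5 /boot ext4 rw,relatime 0 0
"""
# Same listing after the ESP (placeholder device) is mounted over /boot.
MOUNTS_WITH_ESP = MOUNTS_WITHOUT_ESP + "/dev/nvme0n1p1 /boot vfat rw 0 0\n"

print(boot_is_esp(MOUNTS_WITHOUT_ESP, "/dev/nvme0n1p1"))
print(boot_is_esp(MOUNTS_WITH_ESP, "/dev/nvme0n1p1"))
```

With a check along these lines, a non-ESP mount on /boot would be treated the same as no mount at all, and the tool would go on to mount the real ESP over it.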

If you or anyone else who finds this thread feels sufficiently confident in their ability to upstream such a change, you're more than welcome to try. Otherwise I can add this to my to-do list, although it'll be quite some time before I get to it.

grumbly 07-25-2019 03:56 PM

Quote:

Originally Posted by ParadigmComplex (Post 6018619)
Nice research :)

Thanks!

Quote:

Originally Posted by ParadigmComplex (Post 6018619)
it might be best to try to upstream a patch to clr-boot-manager to extend the pre-mount check you've found to check if an existing mount point found is actually the EFI stuff desired. Hopefully they're amenable to it; I don't see why they wouldn't be.

That's a good idea; I'll take a stab at it.

ParadigmComplex 07-25-2019 04:28 PM

Good luck, and keep me in the loop!


All times are GMT -5.