Home Assistant Yellow installation issue

I just received my Home Assistant Yellow (kit with power supply).

I followed the guide https://yellow.home-assistant.io/

I installed a CM4 Lite (Wireless / 4 GB RAM) and a 256 GB NVMe SSD.
I installed HA OS, apparently with no problem: the light blinked, and I saw a “yellow installer” host connected on my LAN.
The whole installation was OK until step 13…

Now in step 13, when I plug the power back in, the green LED flashes once, then the red LED and the green LED immediately come on and stay on.
The LAN light is also green but I don’t see any DHCP activity on my router.

I tried a new OS installation again with the USB drive. But now nothing happens.

It looks like I have a brick with one green LED and one red LED on… What can I do?

Hi,
Unfortunately, as not many folks have Yellow devices, the community hasn’t had time to find out details like recovery:

  • Isn’t there a button combination in the docs to force a wipe and re-install?
  • The USB-C connector on the Yellow board has a jumper which suggests it might operate as a serial UART giving a serial console - this might tell you what is going on, and even a root shell to fix it.
    The RPi foundation have very detailed documentation on the CM4 so the boot order and diagnostics might help here, especially for a CM4 Lite (which doesn’t have any eMMC storage).

As a *nix greybeard whose Yellow is currently moving the datadisk from eMMC to NVMe SSD, the only other suggestion I can offer is to buy a USB-3 M.2 dock and try mounting the SSD on a PC.

With the SSD in a dock, you could try:

  • Seeing if the SSD works, and has HASS partitions (e.g. useful for recovery of data)
  • Wiping the start of the SSD (to remove the /boot partition - which might be getting priority over an external USB boot stick with HASSOS and booting with a damaged SSD install)
  • Installing HASSOS directly (e.g. via the Raspberry Pi flasher tool) to the SSD

But, TBH, I’m just guessing using Linux and RPi knowledge.

Thanks, @FloatingBoater.
I don’t have a USB dock at the moment, but I’m going to get one.
I have used the serial console:
With the SSD removed, it boots the OS installer from the USB stick and I get lots of messages on the console.
With the SSD installed, there is not a single message on the console.
I have another CM4 (with eMMC storage). Perhaps I will try that module before I receive the NVMe dock…

My Yellow unit does not boot because it has a date six months ago, so it does not trust the TLS certificate from the update server, which is too new. It tries to set the date, but I guess it fails.
I opened the unit, looking for a serial console, and saw your comments about the USB-C connector. Lo and behold, it sure looks like it is a serial console, with a jumper.
So I plugged it into my USB serial console server, ironically an Olimex box running FreedomBox, which also promised to change my world but has usually crashed when I want to use it…

So far, nothing… no new USB device shows up.
There is a second jumper, marked “USB-C rcvry” which I tried moving too.

(apparently I can only upload one image at a time)

Here is the picture of the USB/UART jumper, which was really obvious to me once it was mentioned.

The message over here: I'm unhappy with the removal of GPIO - #304 by jkk says that the first ten pins are there, with a serial port. Could it be a serial console?

I used the serial console with the first jumper in the UART position (the default position) and a USB-C cable connected to a Windows PC. I use PuTTY on the PC (speed 115200).

Not sure which way this 10-pin header is arranged… I didn’t get anything either way.

22-03-11 07:29:06 WARNING (MainThread) [supervisor.utils.whoami] Whoami service failed with SSL verification: Cannot connect to host services.home-assistant.io:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate is not yet valid (_ssl.c:1129)')]
22-03-11 07:29:06 WARNING (MainThread) [supervisor.core] System time/date shift over more as 7days found!
22-03-11 07:29:06 INFO (MainThread) [supervisor.host.control] Setting new host datetime: 2022-10-12T21:57:08+00:00
22-03-11 07:29:06 CRITICAL (MainThread) [supervisor.core] Fatal error happening on load Task <coroutine object Core._adjust_system_datetime at 0x7f8f9aa240>: DBus INT32 type "i" must be between -2147483648 and 2147483647
22-03-11 07:29:06 INFO (SyncWorker_0) [supervisor.docker.interface] Attaching to ghcr.io/home-assistant/aarch64-hassio-cli with version 2022.06.0

The core problem is that my unit fails to upgrade because the clock is wrong.
(Certificate dates SHOULD be ignored for initial upgrades. DNSSEC should be enabled, but should ignore times as well.)
There is some code to attempt to update the time, but it does not seem to do it right.
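
One possible reading of the CRITICAL line in the log above, offered as a guess rather than a diagnosis: the target time from the log, 2022-10-12T21:57:08+00:00, fits in a signed 32-bit integer when counted in seconds since the epoch, but not when counted in microseconds. The sketch below only checks that arithmetic. That the Supervisor handed a microsecond value to a 32-bit D-Bus field is an assumption, and `fits_int32` is a hypothetical helper, not Supervisor code.

```python
from datetime import datetime

# Range of the D-Bus INT32 ("i") type, as quoted in the error message.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if value can be carried in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


# Target time taken from the "Setting new host datetime" log line.
target = datetime.fromisoformat("2022-10-12T21:57:08+00:00")
seconds = int(target.timestamp())
microseconds = seconds * 1_000_000  # assumption: the value was sent in microseconds

print("seconds:     ", seconds, "fits int32:", fits_int32(seconds))
print("microseconds:", microseconds, "fits int32:", fits_int32(microseconds))
```

If that reading is right, the date itself was fine and only the integer type used to send it was too small.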

That’s very weird - could the CR2032 RTC battery be dead? You’d hope the RTC would get updated using NTP pretty early on in the boot process.

certificate is not yet valid

Yeah - you’re right - that shouts the RTC is back at the epoch (1970ish?) so the cert is way in the future. Still don’t see why NTP over Ethernet doesn’t fix it (unless the battery/ holder/ hardware has failed).

My guess is the USB-C has an FTDI clone on it - jacme has it working with a WinPC, which would fit. The first row of Yellow jumper pins is supposed to be the same as on a RPi, so it allows the use of serial boards like the RaZberry Z-Wave interface.

I always used “real” DB25/DB9 RS232/485 console servers (ideally with light blue Cisco cables!), so I don’t know what the Olimex exposes (I use NextCloud and Cockpit for similar reasons), but it looks like a Debian box, so I would expect something like miniterm.py /dev/ttyUSB0 115200 to work.

You’ve clearly managed to get a serial console either via USB or 3-jumpers - does CTRL-D CTRL-C bring up a prompt, or is the port just a logging console without a TTY?

I use the same configuration and experience the same issue. After removing the NVMe SSD, the Yellow boots as it should.

Back to my initial problem on a CM4 Lite. I now have a USB NVMe dock.

To recap the issue: after an apparently successful run of the Yellow installer on my NVMe SSD, the board does not boot and the console shows no messages at all.

I put my SSD in the USB dock and connected it to a Linux system. I saw several HASS partitions on the SSD, including the HASS boot partition.
I deleted all partitions, put the SSD back in my Yellow, and restarted the installation procedure with a console attached.

The installer does flash the system to the NVMe disk. Everything seems OK except for some GPT errors, which I don’t understand much about.
Here is the log:

[   68.710339] haos-flash[259]: Getting latest Home Assistant OS version from channel stable...
[   75.349629] haos-flash[259]: Installing Home Assitant OS 9.2 to nvme0n1.
[  241.979682] GPT:Primary header thinks Alt. header is not at the end of the disk.
[  241.987160] GPT:4194303 != 488397167
[  241.990742] GPT:Alternate GPT header not at the end of the disk.
[  241.996774] GPT:4194303 != 488397167
[  242.000375] GPT: Use GNU Parted to correct GPT errors.
[  242.005572]  nvme0n1: p1 p2 p3 p4 p5 p6 p7 p8
[  242.010319] haos-flash[268]: 0+261956 records in
[  242.010924] haos-flash[268]: 0+261956 records out
[  242.013112] haos-flash[259]: Successfully installed Home Assistant OS, shutting down the system

In the end, the problem is still there: my Yellow refuses to boot and prints no messages.

The partition table appears to have several problems - hence the suggestion from Linux to use the Parted tool to fix them.
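
For what the two numbers in the GPT lines mean: the kernel prints these messages when the backup GPT header is not in the last sector of the disk, which is what happens whenever a small disk image is copied onto a larger disk. So the warning on its own may be harmless. A rough sketch of the arithmetic, assuming 512-byte logical sectors (`disk_bytes` is a hypothetical helper):

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def disk_bytes(last_lba: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Size in bytes of a disk whose last addressable sector is last_lba."""
    return (last_lba + 1) * sector_bytes


# Values taken from the "GPT:4194303 != 488397167" log line.
image_bytes = disk_bytes(4194303)     # where the GPT header believes the disk ends
device_bytes = disk_bytes(488397167)  # where the kernel says the disk really ends

print(f"GPT expects a disk of {image_bytes / 2**30:.2f} GiB")
print(f"the device is {device_bytes / 10**9:.2f} GB")
```

The first figure is the size of the image that was flashed and the second is the SSD itself, so the backup header sits at the end of the image rather than at the end of the disk.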

That may well work, but I’ve wiped at least the start of the SSD completely and started from scratch in similar situations. Deleting partitions is like erasing writing on lined paper - the lines are still there.

DBAN, GNU Parted tool, or simple fdisk are all alternatives to this process.

If you’re comfortable with the Linux command line, (RPi or even a live distro booted from a USB drive), then the process can be quick. The hard part is ensuring you wipe the CORRECT DEVICE and not the host machine! You want to wipe the whole device (e.g. /dev/name) not just one partition (e.g. /dev/name1, /dev/name2… ).

If this seems complex - it is. Stop now, before you break something.

Monitor the logs to check which is the SSD device, then zero the start of the storage:

$ sudo journalctl -f
# insert the SSD and note the device name that appears in the log
# you should umount the partitions first, but this sometimes removes the device completely
# N.B. get the device name wrong and this command will WIPE the wrong disk - be very careful
# have you checked that you have off-line, off-site backups?
$ sudo dd if=/dev/zero of=/dev/<SSD DEVICE> bs=4096 count=1024
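
To see what the dd step does without touching a real disk, here is a minimal Python sketch that zeroes the start of an ordinary scratch file. It assumes the intent of the command is 1024 blocks of 4096 bytes (4 MiB). `zero_start` is a hypothetical helper, and the scratch file stands in for the SSD.

```python
import os
import tempfile


def zero_start(path: str, block_size: int = 4096, count: int = 1024) -> int:
    """Overwrite the first block_size * count bytes of path with zeros.

    Roughly what `dd if=/dev/zero of=path bs=4096 count=1024 conv=notrunc`
    does. Returns the number of bytes written.
    """
    zeros = bytes(block_size)
    written = 0
    with open(path, "r+b") as target:  # r+b overwrites in place and never truncates
        for _ in range(count):
            written += target.write(zeros)
        target.flush()
        os.fsync(target.fileno())  # push the zeros out to the storage
    return written


# Practice run on a throwaway 8 MiB file filled with 0xFF, not on a real disk.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"\xff" * (8 * 1024 * 1024))
    scratch_path = scratch.name

wiped = zero_start(scratch_path)
with open(scratch_path, "rb") as check:
    data = check.read()
os.remove(scratch_path)

print("bytes zeroed:", wiped)
print("bytes left untouched:", len(data) - wiped)
```

Only the first 4 MiB change and everything after that is left alone. On a real disk that includes the backup GPT header, which lives in the last sectors.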

Thanks for your advice.

I wiped the device with dd (it took a very long time…), then flashed again with the Yellow installer: same behaviour. It is impossible to boot with the NVMe SSD installed on the board, and also impossible to boot with the NVMe SSD in the USB dock (used like a USB stick).

After that, I made several attempts.

With Raspberry Pi Imager:

  • I installed HAOS 9.2 directly on the SSD: no boot!
  • I installed the “Yellow installer” on the SSD: the boot was successful
  • I installed Raspberry Pi OS Lite directly on the SSD: the boot was successful
  • I installed HAOS 9.2 directly on a normal USB stick (not the SSD): no boot!

For the moment my conclusion is that the current HAOS image provided by Raspberry Pi Imager and by the Yellow installer is faulty and unbootable!

But if that were the case, I would not be the only one having problems…

One thing is sure: the partitions written by the Imager for HAOS and for OS Lite are very different, as the output of the "parted" command shows:
HAOS image

Model: CT250P2S SD8 (scsi)
Disk /dev/sde: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name              Flags
 1      1049kB  34.6MB  33.6MB  fat16        hassos-boot       msftres
 2      34.6MB  59.8MB  25.2MB               hassos-kernel0
 3      59.8MB  328MB   268MB                hassos-system0
 4      328MB   353MB   25.2MB               hassos-kernel1
 5      353MB   622MB   268MB                hassos-system1
 6      622MB   630MB   8389kB               hassos-bootstate
 7      630MB   731MB   101MB   ext4         hassos-overlay
 8      731MB   2073MB  1342MB  ext4         hassos-data

OS LITE image

Model: CT250P2S SD8 (scsi)
Disk /dev/sdd: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End    Size   Type     File system  Flags
 1      4194kB  273MB  268MB  primary  fat32        lba
 2      273MB   250GB  250GB  primary  ext4
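
Besides the number of partitions, the two listings differ in the partition table type itself (gpt versus msdos). As a minimal sketch, assuming 512-byte sectors, this is how the two types can be told apart from the first two sectors of a disk or image. `table_type` is a hypothetical helper, and the buffers are synthetic examples, not data read from the SSD.

```python
def table_type(first_sectors: bytes, sector_bytes: int = 512) -> str:
    """Classify a disk from its first two sectors: 'gpt', 'msdos' or 'unknown'."""
    if len(first_sectors) < 2 * sector_bytes:
        raise ValueError("need at least the first two sectors")
    # An MBR (real or protective) ends sector 0 with the bytes 0x55 0xAA.
    has_mbr_signature = first_sectors[510:512] == b"\x55\xaa"
    # A GPT header starts sector 1 with the ASCII signature "EFI PART".
    has_gpt_header = first_sectors[sector_bytes:sector_bytes + 8] == b"EFI PART"
    if has_gpt_header:
        return "gpt"
    if has_mbr_signature:
        return "msdos"
    return "unknown"


# Synthetic two-sector buffers for illustration only.
blank = bytearray(1024)
mbr_only = bytearray(1024)
mbr_only[510:512] = b"\x55\xaa"
gpt = bytearray(mbr_only)
gpt[512:520] = b"EFI PART"

for label, buf in (("blank", blank), ("mbr_only", mbr_only), ("gpt", gpt)):
    print(label, "->", table_type(bytes(buf)))
```

On a real system the same 1024 bytes could be read from the device node (with root rights) instead of being built by hand.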

As mentioned, I experience the same issue and have spent hours trying to find the cause. I hope the people at Nabu Casa will jump in.

When you say HAOS, what image do you mean? The reason I ask is there is a specific image, separate from the RasPi images for Yellow. No idea what differences there are, but I assume that is the image that the Yellow installer is pulling. See all the images here: Releases · home-assistant/operating-system · GitHub

Yes, it was not the right image for the Yellow.

I started a discussion about this issue on Discord with people from Nabu Casa.
I learned that you can install the Yellow installer on the NVMe:

You can boot the Installer from NVMe, it will run in memory, and overwrite itself with the correct HAOS image for Yellow.

In my case the Yellow installer boots from the NVMe and installs a HAOS image without errors.
But the installed image cannot be booted…

The issue is still open.

Confused - this is a CM4 Lite (no eMMC) and a NVMe SSD which has been zeroed with dd, and the install still failed!

That only leaves the CM4 Lite firmware or HASSOS (which probably has had more testing on CM4 with 2Gb of eMMC - part of the kit).

Can you boot with RPi OS on USB? This includes a firmware update tool (can’t remember if it is fpwupd?) for CM4, but this is clutching at straws unless there’s hardware troubles with the SSD (odd incompatibility?).

The first partition list looks very similar to my CM4 2Gb eMMC layout - boot, 2x kernel, data. Moving the datadisk changes hassos-data-old, and moves hassos-data.

Not sure what the second partition list is - fat32 is sometimes used for boot, but one large unlabeled ext4 is odd. Was that a weirdly resized install image perhaps?

An interesting post mentions the wrong image disabled USB ports for power saving - as my Yellow was the kit, it was pre-installed, so not seen this personally.

Try a different install image?

This probably isn’t only a HA Yellow issue, but a wider issue with (native?) NVMe devices and HA. To me, it has all the same symptoms I’m seeing with a different NVMe-equipped carrier board with a CM4.

I’m using a Waveshare cm4-io-base-a, which is quite similar in specs to the Yellow, especially its NVMe part. My CM4 is the 4Gb/16Gb version. The original installation with HASSOS 7.6 went OK: I actually first installed Raspbian onto the eMMC and used the imager to install HASSOS on the NVMe, then made some changes to config.txt (boot order, and disabling BT and Wi-Fi). HASSOS 7.6 booted OK, and IIRC every upgrade before HASSOS 8.0 worked properly.

After the upgrade to HASSOS 8.0: no NVMe boot, removing and emptying the SSD, and the damn circus of restoring a backup. It was probably HASSOS 8.2 that once again had working NVMe, until HASSOS 9.0 was released. I wasn’t very surprised when it once again stopped booting, with all the same remove-install-restore as with 8.0.

There seems to be a quality control issue with the NVMe + CM4 combination. Every major version ending in zero (8.0, 9.0) stops working with NVMe. When we get an upgrade to something like x.1 or x.2, it may start working again. There’s a clear pattern of broken NVMe support! And it seems to have nothing to do with the NVMe drive itself, since it gets fixed after a HASSOS update. Raspbian has absolutely no issues writing an image to the NVMe disk.
