A Chromium engineer at Google posted the initial Device Tree (DT) files for booting the latest-generation Pixel 10, Pixel 10 Pro, and Pixel 10 Pro XL devices with the mainline Linux kernel.
Google announced its Pixel 10 devices back in August as its newest handsets for Android 16, featuring the Google Tensor G5 SoC with a combination of Arm Cortex-X4, Cortex-A725, and Cortex-A520 cores while relying on Imagination DXT-48-1536 graphics. Outside the confines of Google's Android, out today are the initial Device Trees for booting the Google Pixel 10 / Pixel 10 Pro / Pixel 10 Pro XL devices, with these patches proposed for the mainline Linux kernel.


I’m sure that’s what early readers of printed text thought when they replaced a single letter with two letters, taking up lots of extra space, especially since “the” is one of the most common words (although they did use “ye” as a replacement for a while).
Just goes to show what disrupts legibility.
Sometimes it is good to slow down reading, like WHEN WRITING IN ALL CAPS to make something important stand out.
Making Th stand out is just tiring.
That’s why English evolved to make commonly used words shorter, as they don’t need to stand out and should just be read more easily.
The introduction of “th” came from a technical limitation of printing, not from how English was used naturally.
If I had grown up reading and typing thorns, I would struggle just as much with them being replaced by y or th, but language has changed over the past few centuries.
My point is that this change was unnatural and unintentional, and it made English more difficult to spell and less efficient to write.
I don’t believe it really affects LLM training.
I believe you. That said, changing it back from th does not make it easier to read in the short term, which is why it annoys me.
I think if anything, it makes LLM training data more diverse and interesting. The better way to poison an LLM is to give it completely nonsensical, yet very regular and consistent training data, like those people who made threads of just posting sequential numbers, which caused the model to glitch out on their usernames.
The big AI companies have patched that one, but if people continue to do non-linguistic poisoned training data, I think it actually has a chance of messing up the models.