U+0049 I LATIN CAPITAL LETTER I
U+0069 i LATIN SMALL LETTER I
U+0130 İ LATIN CAPITAL LETTER I WITH DOT ABOVE
U+0131 ı LATIN SMALL LETTER DOTLESS I
While the names of the first two don't explicitly state that they should be dotless and dotted, respectively, the Unicode standard section on the block containing those two [0] does contrast them with the dotted and dotless versions, at least implying that they should be rendered dotless and dotted, respectively.
Unicode has historically been against adding a separate codepoint for every single language's orthography when the glyphs are (almost) identical to an existing one ("allographs"). Controversy arose when the consortium proposed considering Han characters, which do have language variants, to be allographs, which led to what is known as "Han unification".
IMO not adding a separate character for Turkish was a mistake, since Unicode tries to support lower/upper case conversion (which doesn't apply to Han characters).
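To make the case-conversion problem concrete, here is a rough sketch of why the mapping needs to know the language. It only handles the four code points listed above, uses simple one-to-one mappings, and the `turkic` flag and function names are made up for illustration (a real library would consult locale data instead):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simple (one code point in, one code point out) uppercase mapping for the
 * "i" letters only. `turkic` is a hypothetical flag standing in for a
 * Turkish/Azerbaijani locale check. Other code points pass through. */
uint32_t i_to_upper(uint32_t cp, bool turkic)
{
    switch (cp) {
    case 0x0069: return turkic ? 0x0130 : 0x0049; /* i -> İ (Turkic) or I */
    case 0x0131: return 0x0049;                   /* ı -> I in both modes */
    default:     return cp;
    }
}

/* Matching lowercase mapping, again restricted to the "i" letters. */
uint32_t i_to_lower(uint32_t cp, bool turkic)
{
    switch (cp) {
    case 0x0049: return turkic ? 0x0131 : 0x0069; /* I -> ı (Turkic) or i */
    case 0x0130: return 0x0069;                   /* İ -> i (simple mapping) */
    default:     return cp;
    }
}
```

The point is that U+0049 and U+0069 each have two possible partners, so the same code point round-trips differently depending on a flag that is not stored in the text itself.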
But C compilers don't guarantee that your struct won't have holes (by default), so you may have to use something like __attribute__((packed)) to ensure the structs are packed:
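A minimal sketch of the idea, assuming GCC or Clang (`__attribute__((packed))` is a compiler extension, not standard C) and C11 for `_Static_assert`. The struct and field names are invented for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire-format header with default layout. Most ABIs insert
 * padding after `tag` so that `length` is naturally aligned. */
struct header_padded {
    uint8_t  tag;
    uint32_t length;
    uint16_t checksum;
};

/* Same fields, but the compiler is told not to insert any padding. */
struct __attribute__((packed)) header_packed {
    uint8_t  tag;
    uint32_t length;   /* now at offset 1, i.e. not naturally aligned */
    uint16_t checksum;
};

/* Compile-time checks that the packed layout really has no holes. */
_Static_assert(sizeof(struct header_packed) == 7, "1 + 4 + 2 bytes");
_Static_assert(offsetof(struct header_packed, length) == 1, "no gap after tag");
_Static_assert(offsetof(struct header_packed, checksum) == 5, "no gap after length");
```

The static assertions are the useful part: if the layout ever differs from what you expect, the build fails instead of the data being silently misread.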
This is not true of adjacent bitfields, at least for C99:
An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit.
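As a small illustration of that clause (the struct and helper names are hypothetical): the three bit-fields below total 8 bits, so they should all land in a single storage unit. The bit order within the unit and the size of the unit are still implementation-defined, so this is only safe for in-memory flags, not for matching an external layout.

```c
/* Three adjacent bit-fields, 1 + 3 + 4 = 8 bits in total. Per the quoted
 * rule they are packed into adjacent bits of the same storage unit. */
struct flags {
    unsigned int ready : 1;
    unsigned int mode  : 3;
    unsigned int count : 4;
};

/* Helper that fills in all three fields; values are truncated to the
 * field width by the assignment (unsigned bit-fields wrap modulo 2^width). */
struct flags make_flags(unsigned ready, unsigned mode, unsigned count)
{
    struct flags f = {0};
    f.ready = ready;
    f.mode  = mode;
    f.count = count;
    return f;
}
```

On common ABIs `sizeof(struct flags)` comes out as `sizeof(unsigned int)`, but that is an observation about typical compilers rather than something the quoted text promises.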
That’s usually something your ABI will describe in fairly precise terms, though if (as in your example) you want non-naturally-aligned fields, you may indeed want to both use a packed struct and prepare for alignment faults on less-tolerant architectures.
In microcontrollers it's very common to see generated code that creates structs for the registers. They will typically output fields that are a full machine word in size (or maybe in some cases as small as a byte), and individual bits will be addressed with bitmasking (i.e. `my_device.some_reg |= SOME_REG_FLAG_NAME` to set a bit or `my_device.some_reg &= ~SOME_REG_FLAG_NAME` to clear it). It is sometimes necessary to be thoughtful about batching writes so that certain bits are set before the peripheral begins some process. A trivial example would be:
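Something along these lines, as a sketch only: the register block, the flag names and the "configure, then start" rule are all hypothetical, and on real hardware the struct would sit at a fixed address taken from the vendor header rather than being passed in by pointer.

```c
#include <stdint.h>

/* Hypothetical peripheral register block, one machine word per register. */
typedef struct {
    volatile uint32_t ctrl;
    volatile uint32_t status;
} my_device_t;

/* Hypothetical bit flags within the ctrl register. */
#define CTRL_ENABLE  (1u << 0)
#define CTRL_IRQ_EN  (1u << 1)
#define CTRL_START   (1u << 7)

/* Set the configuration bits in one batched write, then set START in a
 * second write, so the peripheral never starts half-configured. */
static inline void my_device_start(my_device_t *dev)
{
    uint32_t v = dev->ctrl;          /* single read of the register */
    v |= CTRL_ENABLE | CTRL_IRQ_EN;  /* set configuration bits */
    v &= ~CTRL_START;                /* make sure START is not set yet */
    dev->ctrl = v;                   /* write 1: configuration only */
    dev->ctrl = v | CTRL_START;      /* write 2: kick off the operation */
}
```

Working on a local copy and writing it back avoids several separate read-modify-write cycles on a `volatile` register, each of which would be a real bus access.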
Yes, it's a bit frustrating, especially for headers with inline/macro code. And for headers, requiring C23 doesn't seem sensible for quite some time. I define a macro:
[0]: https://www.unicode.org/charts/PDF/U0000.pdf