Hacker News | dropbear3's comments

There are the following codepoints:

    U+0049 I LATIN CAPITAL LETTER I
    U+0069 i LATIN SMALL LETTER I
    U+0130 İ LATIN CAPITAL LETTER I WITH DOT ABOVE
    U+0131 ı LATIN SMALL LETTER DOTLESS I
While the names of the first two don't explicitly state that they should be dotless and dotted, respectively, the Unicode standard section on the block containing those two [0] does contrast them with the dotted and dotless versions, which at least implies that they should be rendered that way.

Unicode has historically been against adding a separate codepoint for every single language's orthography when the glyphs are (almost) identical to an existing one ("allographs"). Controversy arose when the consortium proposed considering Han characters, which do have language variants, to be allographs, which led to what is known as "Han unification".

[0]: https://www.unicode.org/charts/PDF/U0000.pdf


IMO not adding a separate character for Turkish was a mistake, since Unicode tries to support lowercase/uppercase conversion (which doesn't apply to Han characters).
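To make the case-conversion problem concrete, here is a minimal sketch in C of how the four codepoints above map to each other. The function names and the `turkic` flag are hypothetical, only the simple (single-codepoint) mappings are shown, and the full non-Turkic lowercase of U+0130 is actually the two-codepoint sequence U+0069 U+0307.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: uppercase one codepoint of the "i" family.
   `turkic` selects the Turkish/Azerbaijani tailoring. */
uint32_t upper_i(uint32_t cp, bool turkic)
{
    if (cp == 0x0069)                        /* i */
        return turkic ? 0x0130 : 0x0049;     /* İ when Turkic, otherwise I */
    if (cp == 0x0131)                        /* ı uppercases to I either way */
        return 0x0049;
    return cp;                               /* other codepoints are not handled here */
}

/* Hypothetical helper: lowercase one codepoint of the "i" family
   (simple mappings only). */
uint32_t lower_i(uint32_t cp, bool turkic)
{
    if (cp == 0x0049)                        /* I */
        return turkic ? 0x0131 : 0x0069;     /* ı when Turkic, otherwise i */
    if (cp == 0x0130)                        /* İ */
        return 0x0069;                       /* simple mapping: i */
    return cp;
}
```

The same codepoint gives a different result depending on the language of the text, which a language-independent case table cannot express.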


> How would you do [packed structs] in C?

Bitfields! This is valid C:

  struct foo {
    unsigned bg_priority: 2;
    unsigned character_base: 2;
    // ...
  };
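A small sketch of how such fields behave when assigned. The struct name and the helper are made up for illustration, and only the two fields shown above are included.

```c
/* Hypothetical struct with just the two 2-bit fields from the example. */
struct bg_control {
    unsigned bg_priority    : 2;
    unsigned character_base : 2;
};

/* Stores `value` in the 2-bit field and returns what the field kept. */
unsigned set_priority(struct bg_control *c, unsigned value)
{
    c->bg_priority = value;   /* unsigned bit-field: the value is reduced modulo 4 */
    return c->bg_priority;
}
```

Storing 3 reads back as 3, while storing 5 reads back as 1, and the neighbouring field is not disturbed.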


But C compilers don't guarantee that your struct won't have holes (by default), so you may have to use something like __attribute__((packed)) to ensure your structs are packed:

    struct bitmap_file_header
    {
      UWord signature;
      UDWord file_size;
      UWord reserved_1;
      UWord reserved_2;
      UDWord file_offset_to_pixel_array;
    } __attribute__((packed));
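As a rough check of what the attribute buys you, the sketch below compares the same layout with and without packing. It assumes GCC or Clang, substitutes `uint16_t`/`uint32_t` for `UWord`/`UDWord` (an assumption about those typedefs), and uses made-up struct names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after `signature`. */
struct header_default {
    uint16_t signature;
    uint32_t file_size;
    uint16_t reserved_1;
    uint16_t reserved_2;
    uint32_t file_offset_to_pixel_array;
};

/* Packed layout (GCC/Clang extension): members are laid out back to back. */
struct header_packed {
    uint16_t signature;
    uint32_t file_size;
    uint16_t reserved_1;
    uint16_t reserved_2;
    uint32_t file_offset_to_pixel_array;
} __attribute__((packed));

/* 2 + 4 + 2 + 2 + 4 bytes, with file_size directly after signature. */
static_assert(sizeof(struct header_packed) == 14, "packed header should be 14 bytes");
static_assert(offsetof(struct header_packed, file_size) == 2, "no hole after signature");
```

On common ABIs where `uint32_t` is 4-byte aligned, `struct header_default` typically comes out at 16 bytes instead.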


This is not true of adjacent bitfields, at least for C99:

  An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit.
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf 6.7.2.1 § 10
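A quick way to observe this, sketched with a made-up struct: four adjacent 2-bit fields should share one storage unit rather than occupying one unit each. The exact size of that unit is implementation-defined, so the check only assumes it is no larger than `unsigned`, which is what GCC and Clang do.

```c
/* Hypothetical struct: 8 bits of adjacent bit-fields. */
struct four_fields {
    unsigned a : 2;
    unsigned b : 2;
    unsigned c : 2;
    unsigned d : 2;
};

/* Returns 1 if all four fields ended up in a single unsigned-sized unit. */
int fits_in_one_unit(void)
{
    return sizeof(struct four_fields) <= sizeof(unsigned);
}
```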


That’s usually something your ABI will describe in fairly precise terms, though if (as in your example) you want non-naturally-aligned fields, you may indeed want to both use a packed struct and prepare for alignment faults on less-tolerant architectures.
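One common way to sidestep alignment faults when reading such a field is to copy it out with `memcpy` instead of dereferencing a misaligned pointer. A minimal sketch, with a hypothetical helper name:

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value stored at any byte offset, in host byte order. */
uint32_t load_u32(const unsigned char *bytes, size_t offset)
{
    uint32_t value;
    memcpy(&value, bytes + offset, sizeof value);  /* makes no alignment assumption */
    return value;
}
```

Compilers generally turn this into a single load on architectures that tolerate unaligned access.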


There’s also a directive (don’t have code in front of me) that you can do at file level that will cause all subsequent struct defs to be packed…

#push(pragma(pack(0)) ??

I’ve done a lot of direct register access in C this way. I do like Zig's ability to just define the different sizes, though.


It’s MS(V)C syntax, now supported by GCC and Clang as well:

  #pragma pack(push)
  #pragma pack(1)
  /* ... */
  #pragma pack(pop)
The first two lines can also be condensed into

  #pragma pack(push, 1)
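A short sketch of the effect, using made-up struct names: the struct defined between push and pop is packed, while an identical one defined after the pop gets the default layout again.

```c
#include <stdint.h>

#pragma pack(push, 1)        /* save the current setting, then pack to 1 byte */
struct packed_pair {
    uint8_t  tag;
    uint32_t value;          /* starts at offset 1, no padding */
};
#pragma pack(pop)            /* restore the previous setting */

struct default_pair {
    uint8_t  tag;
    uint32_t value;          /* usually preceded by 3 bytes of padding */
};
```

With GCC, Clang, or MSVC, `sizeof(struct packed_pair)` is 5; `sizeof(struct default_pair)` is typically 8.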


In microcontrollers it's very common to see generated code that creates structs for the registers. They will typically output fields that are a full machine word in size (or maybe in some cases as small as a byte), and individual bits will be addressed with bitmasking (i.e. `my_device.some_reg |= SOME_REG_FLAG_NAME` to set a flag or `my_device.some_reg &= ~SOME_REG_FLAG_NAME` to clear it). It is sometimes necessary to be thoughtful about batching writes so that certain bits are set before the peripheral begins some process. A trivial example would be:

  port_a.data_out |= GPIOA_PIN_1 | GPIOA_PIN_2;
and

  port_a.pin1 = true;
  port_a.pin2 = true;


This is why manufacturers don't use bitfields for volatile register access. You end up with bloated, hazard-prone code with multiple read-modify-writes.
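A minimal sketch of the mask-based style, where both pins change in a single read-modify-write. The struct layout, the bit positions of `GPIOA_PIN_1` and `GPIOA_PIN_2`, and the helper names are assumptions for illustration, not any vendor's actual header.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define GPIOA_PIN_1 (1u << 1)
#define GPIOA_PIN_2 (1u << 2)

/* Hypothetical register block with one word-sized output register. */
struct gpio_port {
    volatile uint32_t data_out;
};

/* One read and one write: every bit in `mask` is set by the same store. */
void set_pins(struct gpio_port *port, uint32_t mask)
{
    port->data_out |= mask;
}

/* One read and one write: every bit in `mask` is cleared by the same store. */
void clear_pins(struct gpio_port *port, uint32_t mask)
{
    port->data_out &= ~mask;
}
```

With a volatile bitfield struct, each `port_a.pinN = true` would instead be its own read-modify-write, so the peripheral could observe pin 1 set before pin 2.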


{0} is standard C. {} is currently a (common) compiler extension but will be standard C23: https://open-std.org/JTC1/SC22/WG14/www/docs/n2900.htm


Yes, it's a bit frustrating, especially for headers with inline/macro code. And for headers, requiring C23 doesn't seem sensible for quite some time. I define a macro:

    #ifdef __cplusplus
    # define ZERO_INIT {}
    #else
    # define ZERO_INIT {0}
    #endif
Works for arrays, aggregates, scalars, etc., but I just use it for arrays and aggregates: `char buf[32] = ZERO_INIT; struct X x = ZERO_INIT;`
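A small usage sketch of the macro in C. The fields of `struct X` and the two helper functions are made up for illustration.

```c
#include <stddef.h>

#ifdef __cplusplus
# define ZERO_INIT {}
#else
# define ZERO_INIT {0}
#endif

/* Hypothetical aggregate with mixed member types. */
struct X {
    int    id;
    double weight;
    char   name[8];
};

/* Returns 1 if every element of a ZERO_INIT-initialized array is zero. */
int buffer_is_zeroed(void)
{
    char buf[32] = ZERO_INIT;
    for (size_t i = 0; i < sizeof buf; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Returns a struct whose members are all zero-initialized. */
struct X make_x(void)
{
    struct X x = ZERO_INIT;
    return x;
}
```

In C, `{0}` initializes the first member explicitly and the remaining members implicitly to zero, which is why one macro covers both arrays and structs.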


> Occam's razor says that tooling, or developers, just aren't great at shaking unreferenced content from production builds.

The Cutting Room Floor[0] is an empirical proof of this assertion.

[0] https://tcrf.net/The_Cutting_Room_Floor


And the corollary, "In theory, theory and practice are the same thing. In practice..."


Are those mutually exclusive? Modern society can be a systemic trainwreck and still be better than what came before.

