Egui: An easy-to-use immediate mode GUI (github.com/emilk)
366 points by ducktective on Aug 13, 2021 | 141 comments


I investigated egui early this year. It's a very good library! For me, getting used to immediate mode UI was a change from being used to DOM / React-style UI. Immediate mode works well for a lot of things, but it struggles with complex layouts, e.g. flexbox. This is acknowledged in egui's README.

Couple things I built with egui: https://artifacts.bypaulshen.com/rust-gui/code-render/ https://artifacts.bypaulshen.com/rust-gui/tictactoe/


Could you possibly share the source to your two examples? I'd love to see how much is accidental complexity vs actual complexity, compared to how web UIs (and other similar paradigms) can be built in an almost fully declarative way.


Part of what makes the DOM slower by comparison is the fact that it supports such advanced, declarative, adaptive layouts. There's a tradeoff between having that and not having it.


Here's the main file of the tictactoe implementation.

https://gist.github.com/paulshen/b04db68c754f73e693cd29a481c...

`update` is similar to React's render function, except here it's being called on every interaction (e.g. every mousemove tick) and redrawing the entire screen from scratch! The code runs so fast that this is fine from a perf standpoint. Note that egui has both a reactive and a continuous mode. You can re-render everything from scratch at 60 fps (continuous mode) and it'd be fine in most cases.
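
For anyone curious what that looks like in code, here's a rough sketch of an eframe `update` method in the same spirit (the exact trait and signature vary between egui/eframe versions, and this little board layout is made up for illustration, not taken from the gist):

    use eframe::egui;

    struct TicTacToe {
        board: [Option<char>; 9],
    }

    impl eframe::App for TicTacToe {
        // Called on every input event (reactive mode); the whole UI is
        // rebuilt from `self` each time, nothing is retained between calls.
        fn update(&mut self, ctx: &egui::Context, _frame: &mut eframe::Frame) {
            egui::CentralPanel::default().show(ctx, |ui| {
                egui::Grid::new("board").show(ui, |ui| {
                    for (i, cell) in self.board.iter_mut().enumerate() {
                        let text = cell.map(String::from).unwrap_or_else(|| " ".into());
                        if ui.button(text).clicked() && cell.is_none() {
                            *cell = Some('X');
                        }
                        if i % 3 == 2 {
                            ui.end_row();
                        }
                    }
                });
            });
        }
    }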


Tic-tac-toe is frustrating me because I can't seem to force it into a draw.


I tried to make it win every time, and I'm wondering: given a tic-tac-toe opponent that makes random moves, is it always possible to play such that your opponent always wins?


No such strategy could exist, because if it did and your opponent happened to randomly play that same strategy, it would force itself to lose.


Good point! I guess the difference is whether it exists for the player that does the second move.


The tic tac toe is easily defeated


Did you read the text at the bottom of the page?

Don't try too hard. The "AI" makes random moves.


Be sure to check out the web demo, quite impressive IMO:

https://emilk.github.io/egui/index.html


Why does every IM UI library have a dropdown widget that they call "Combo box"? Dear IMGUI, Nuklear, this one does it too.

A dropdown list is not a combo box - a combo box is a text input box combined with a dropdown list. This "combo" aspect is missing in all these frameworks, but they still call their simple dropdowns "Combo box".
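
For what it's worth, here's roughly what egui's "combo box" looks like in use - a plain dropdown of selectable values, with no text input (a hedged sketch; the enum and labels are just made up for illustration):

    #[derive(PartialEq)]
    enum Fruit { Apple, Banana, Cherry }

    fn fruit_picker(ui: &mut egui::Ui, selected: &mut Fruit) {
        egui::ComboBox::from_label("Fruit")
            .selected_text(match selected {
                Fruit::Apple => "Apple",
                Fruit::Banana => "Banana",
                Fruit::Cherry => "Cherry",
            })
            .show_ui(ui, |ui| {
                // Just a list of selectable values - no editable text field.
                ui.selectable_value(selected, Fruit::Apple, "Apple");
                ui.selectable_value(selected, Fruit::Banana, "Banana");
                ui.selectable_value(selected, Fruit::Cherry, "Cherry");
            });
    }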


Because in the 'olden days' (Win 3.0 era) there was only a combobox, not a separate dropdown. Over the years the names have stuck.


FWIW, Windows screen readers also refer to plain drop-downs as combo boxes. I didn't think anything of it until I read this thread.


I think what happened is that a combobox and dropdown-list merged, because basically the combobox is an editable dropdown list. Qt's QComboBox definitely combines them. Apple does not (NSComboBox is editable, NSPopupButton appears to not be). Not sure what Windows or GTK do.


In some widely influential toolkits, like Win32[0], this is just a part of the style state and even comboboxes without text fields are called comboboxes.

So the name stuck. It isn't really much different than calling them "radio buttons" even though they have nothing to do with radios.

[0] https://docs.microsoft.com/en-us/windows/win32/controls/abou...


> It isn't really much different than calling them "radio buttons" even though they have nothing to do with radios.

The button set on the face of many dashboard radios would physically pop out the previously pressed button when you pushed a different one in so only one in the set would be selected. The name is pretty obvious if you've used the physical version but I haven't seen one since the 90s.


Yes, I know this backstory, which is why I mentioned radio buttons: they're called that because of a historical reference, not because they have anything to do with radios.

Similarly, comboboxes are called that because of a historical reference (to the combobox control on Win32 that provided the functionality), even though they aren't really a combination of an input box and a list.

(though one could claim that they are still a combination of a button and a drop-down list, so the name still has some relevance to what is displayed, unlike radio buttons)


> … not because they have anything to do with radios.

They work like radio buttons. How’s that having ‘nothing’ to do with radios?


I use JUCE a lot and they use the ComboBox name as well. Always reminds me of creating custom levels in THPS...


the juce combobox uses a texteditor widget, so it actually deserves that name.




> Why does every IM UI library have a dropdown widget that they call "Combo box"?

They can call it whatever they want?

> A dropdown list is not a combo box

Who says?


Parent has been downvoted to oblivion, but they make a good point - arguments over words and names seldom produce any interesting discussion or definitive conclusions.


https://docs.microsoft.com/en-us/windows/win32/controls/comb...

The Win32 API is going to continue to drive naming conventions when all of us are dead.

But yeah, the difference between a combobox, a listbox, and a dropdownlist is just a flag value.


There's actually a separate ListBox control. And I never thought of it at the time, but why does an option turn a ComboBox into a DropDownComboBox, but an option doesn't turn a ListBox into a DropDownListBox? Oh no, it's all coming back to me, because ComboBoxes were added later and rather than change ListBox they added DropDownList functionality to ComboBox.

I also don't see an option in ListBox to make it multi-select, was it all horribly done in the notification handling code?


The font rendering is almost unreadable on my machine, but the functionality seems to be there


On Windows 10 here, the font rendering was also what put me off immediately in the demo.


It seems like the font rendering doesn't do any antialiasing (other things are antialiased; you can toggle it in backend/settings). The result is pretty painful at small sizes, unless you compensate with screen resolution (as Apple now does).

Zooming the page to 130% gives me a quite acceptable experience.


Same - Fedora 34, nvidia driver. Excellent performance though.


> NOTE: The WebGL backend does NOT pass the color test.

> This is because WebGL does not support linear framebuffer blending (not even WebGL 2!).

> Maybe when WebGL3 becomes mainstream in 2030 the web can finally get colors right?

Yes, I had to transcribe this manually because I couldn't copy and paste it.


This is quite standard desktop gui behaviour.


Definitely, but I think part of the underlying criticism here is that standard desktop GUI behavior on text selection and copy-paste is bad, and we're kind of questioning whether it's good to encourage the web to continue moving in that direction.


This is way snappier on iOS than I would have expected. Feels faster than most web apps.


That official egui demo is phenomenal.


I can't seem to select any text on my phone.


Text in labels and other controls is not selectable (unfortunately); in this regard egui behaves like a native app, so mostly just editable text is selectable. This feels a bit unnatural on the web, but one must consider that egui uses a custom WebGL renderer, so the content is not backed by standard DOM.


In native apps with Qt, text on any widget can be made selectable via a property.


The text in the textbox is selectable, so it's not like egui can't handle text selection. It's just following convention with making GUI text unselectable.


Non-copyable text by default in "native" applications sounds reasonable, but it's actually one of those things that introduces unnecessary friction down the line, when users try to write tutorials or reports and have to manually type out everything.


Yeah, I would like to see that behavior as an option, or even better, allow selecting the whole UI as a block of text (to feel natural and be searchable on the web). I believe any UI (incl. native) should not prevent you from copying any text.


Yeah, it's not using the DOM, it's using a canvas with WebGL.


As a user of a keyboard driven browser, I hate it. I hope this trend of rendering the whole page on html canvas never catches on.


I use this in a few of my projects. Simple, no complaints. Compile time is faster than it takes for me to yawn most of the time (using the mold linker and sccache, which improve compile time a lot with a far superior linker and caching of compiled dependencies, respectively).

(Edit to add: below complaint is entirely within rust and irrelevant to this library)

Only problem (not the fault of egui) is that sometimes the elements disallow me from using the template app object mutably as a whole: sometimes I have to mutably borrow a part of `self` before the UI part (instead of mutating `self.whatever` within the UI element, I have to do `let borrow = self.whatever.borrow_mut()` before the UI element and then use that borrow), which seemed odd to me as a beginner. I think this is mostly a Rust "problem", but also maybe because of how UI element creation was designed.

If you understand the borrow rules then you'll probably understand it fine. I might be able to just lock an Arc Mutex instead and avoid the problem but I just learned how to use them so :p


The awkwardness is because use of `self.whatever` in closures borrows `self` as a whole instead of just the `whatever` part. The good news is that Rust is going to fix this soon:

https://blog.rust-lang.org/2021/05/11/edition-2021.html#disj...


Oh thank god. This has been so annoying in the past


Thank you so much! that's very interesting.


If the problem is as pornel described, you can work around it by creating the borrow of self.whatever outside the closure.

Instead of:

    foo(|| {
        bar(&self.whatever);
    });
... do:

    let whatever = &self.whatever;
    foo(|| {
        bar(whatever);
    });
To avoid leaking `whatever` into the scope, there's an idiom that looks like this:

    foo({
        let whatever = &self.whatever;
        || {
            bar(whatever);
        }
    });


Yeah, that's what I've had to do; it just seemed like an oversight tbh. Maybe the idea is to not keep the entire GUI state in a single object, though. But it's the one time learning Rust that I thought "hey, that doesn't make sense" - which really speaks to Rust's strengths rather than its weaknesses. Can't wait for the 2021 edition.


Reminds me of highly interactive older Windows programs, like Photoshop. Every slight mouse movement gives you some sort of feedback, every click makes a change, everything is snappy... I like those things and I'm sad the web took them away. On the web everything is dull and dead; you can tell it very much was designed for reading.


How is it for accessibility?

Looking at the custom widget example didn't leave me with the impression that it has that in mind, but I could be mistaken.


Blind screen reader user and game developer here.

As of a recent release, egui pushes out some events that I'm able to consume and provide TTS feedback for. You can find a simple Bevy example with a bit of event coverage here:

https://github.com/ndarilek/bevy_egui_a11y

In my game, I'm able to use this for buttons, checkboxes, and single-line text edit fields. Sliders are a bit wonky but I'm hoping to eventually get those working better as well. Much of this development is needs-driven for me, and as of now I only need these few widgets. I certainly wouldn't complain if coverage improved, though. :)

Before anyone jumps on me about screen readers not being the accessibility end game, I do know that. But as a totally blind developer myself, I can't easily work on solutions for low vision, high contrast, etc. And for my needs (simple keyboard navigable game UIs for games targeted at blind players) it works well enough, and should hopefully get better soon.

For true cross-platform accessibility with integrations into native APIs, we'll need something like https://accesskit.dev.


> my needs (simple keyboard navigable game UIs for games targeted at blind players)

Huh... I am surprised that, with that target, you are even using a UI library--and then dealing with its resulting level of accessibility--rather than building something more akin to an IVR tree out of direct usage of text-to-speech and keyboard APIs.


Because I want it to look good (or at least OK) and work on mobile.


Interesting, thank-you!

I hadn't heard of access kit; I'll dig through that today.


AccessKit developer here. It's still really early in development; there's not much there yet.


> Each node has an integer ID, a role (e.g. button or window), and a variety of optional attributes. The schema also defines actions that can be requested by assistive technologies, such as moving the keyboard focus, invoking a button, or selecting text.

This sounds very similar to what I'm using for my Semantic UI project (which has similar aims).

Accessibility systems require the ability to programmatically interact with the UI, too (install Accerciser if you're on an AT-SPI2-based system to have a play around); I'm not sure how your system supports typing. (Is it all done via Action::ReplaceSelectedText?)

Also, have you thought about latency? AT-SPI2 is really laggy (“bring down your system for several seconds at a time” levels of laggy), and from a cursory inspection AccessKit looks even heavier.


I'd like to know more about the Semantic UI project.

The way text input is implemented depends on the user's platform and input needs. When using a screen reader with a hardware keyboard, the screen reader will often use the accessibility API to programmatically move the keyboard focus, but once the focus is in a text input control, the input itself happens as usual, not through the platform's accessibility API. For users who require alternate input methods such as speech recognition, it depends on the platform. On Windows, for instance, text input isn't even done through the accessibility API; it's done through a separate API called Text Services Framework. But AccessKit will offer the ReplaceSelectedText action for platforms that can expose it.

I have certainly thought about latency; as a Windows screen reader developer, it has been a difficult problem for a long time. The relevant factor here is not the amount of information being pushed, but the number of round trips between the assistive technology (e.g. screen reader) and the application. If I'm not mistaken, this is what makes AT-SPI problematic in this area. This has also been a problem for the Windows UI Automation API, and a major focus of my time on the Windows accessibility team at Microsoft was to help solve that problem. As for AccessKit, I'll refer you to the part in the README about how applications will push tree updates to platform adapters. Since a large tree update can be pushed all at once, AccessKit doesn't make the problem of multiple round trips any worse.


> The relevant factor here is not the amount of information being pushed, but the number of round trips between the assistive technology (e.g. screen reader) and the application. If I'm not mistaken, this is what makes AT-SPI problematic in this area.

That explains a lot! AT-SPI2 has, as you say, a lot of round trips – and some applications (e.g. Firefox) seem to use a blocking D-Bus interface that means they drop X events while talking to the accessibility bus.

> I'd like to know more about the Semantic UI project.

I don't think it qualifies for a definite article just yet. :-) I got annoyed with the lack of good, lightweight, cross-platform GUIs in Rust, and I tried to make my own, but then faced the same issue with accessibility APIs… so now I'm trying to solve both problems at once: defining a schema and interaction protocol for the semantics of a user interface, as a first-class citizen – all the information needed to construct a GUI interface would be present in the “accessibility data”, but in principle any kind of UI could be generated just as easily. (Of course, a GUI auto-generated from the SUI data would be like a CSS-free webpage; I'm planning to make a proper GUI library too, later.)

There are three types of thing in the schema I've got so far:

• “Widget type” – basically a role. Each widget has exactly one widget type, which implies a certain set of features (e.g. section-with-heading has a heading)

• “Feature” – a group of attributes with a semantic meaning (e.g. the heading feature consists of a reference to the heading widget (which must have the feature providing its natural language representation)). I'm not sure how to deal with stuff like “can be scrolled”, because I still haven't finished bikeshedding things like “should there be implied zero-size features, or should widget types just have a lot of semantics, or should there be a load of explicit-but-redundant mandatory features on every button widget saying it can be pressed?”. (I'm leaning towards the latter, now, under the assumption that simplicity is better than trying to reduce bandwidth.)

• “Event”. Every change to the state of widgets is accompanied by an event. There are separate events for semantically different things even if the same thing happened; for instance, when LibreOffice Calc deletes table cell widgets that have gone off-screen, the widgets have been deleted but the actual cells are still there; that's a different thing to what happens when someone deletes a worksheet, so it should have a different event. This makes SUI retained-mode, but it should be usable with immediate-mode UIs in the same situations as AccessKit is.

I haven't worked out how to represent “alternate interface interacts with program” yet, but I'm leaning towards a second kind of event, with the set of valid user events (and hence what the alternate UI “looks” like) determined by the

Another question is how to represent cursors. Obviously there should be coordinate-positional (mouse-like) cursors and cursors over the widget graph, but keyboard-driven GUIs don't behave like either of those things… so do I just let the alternate interface deal with cursors? But then how does the application know what's currently selected by the cursor? (Focus, hover, select… all with different semantics, not all of which I'm aware of.) Maybe SUI should just completely keep out of that, and pass through a cursor ID and various events without trying to fit them to a model?

You can tell I'm not very good at this; if I'd heard of AccessKit earlier than a week into this project, I wouldn't've started it! :-p

Since pretty much every OS supports Unix Domain Sockets, I intended to use that as the communication protocol. Backends for existing accessibility systems (e.g. AT-SPI2, IAccessible2) were planned as daemons, but to be honest I don't know enough about IPC to have planned this out properly, and I haven't really got attached to any one architecture. I don't even know that that would work properly; IAccessible is a COM interface, and afaik Windows cares strongly about which process provides those.

I thought amount of information was a factor in IPC latency (even though computers can download GBs of data over a network in seconds), so I've been distracting myself with trying to “lazy-load” lots of the data. If you're right about latency – which you probably are – then that's worse than useless, and I should just bubble that up.

A final question is: how to deal with reading order? I have no answers to this.


The WebGL demo has a checkbox for (Experimental) screen reader support but I don't have any speakers on this computer to test.


It basically tells you what you clicked on, so for example "Widget Gallery: checked checkbox" or "Click me: button" or "width: slider, 250".

It also reads these things out when you tab through the interface (which mostly works). Obviously a bit rough around the edges, but I've seen far worse.


Yeah, I believe accessibility is not addressed at all. Similarly, when compiled for the web, it would be great if the UI actually behaved as text (an actual web page) and could be copied to the clipboard, etc.


Why is everyone concerned with accessibility on a small GUI tool? Someone makes a cool project, and it's always Hacker News with the question of how it serves this very small demographic.


I think I get what you mean: the percentage of people who access the web using a screenreader is not huge. That doesn't mean we should discount it, it's just common courtesy and it's a legal requirement in many circumstances.

But if you take one step back, the demographic is not very small at all! It includes everyone who is a bit older and maybe can't see as well as when they were 23, so they like to increase the contrast a bit. I myself am not very far above 23 and I read hackernews zoomed in to 130%, it's just more readable. Many people have an easier time using a computer by for example making buttons larger so that a mouse can click them faster. There are more examples.

And last but not least: writing functional UI tests for applications is enabled in good part by the accessibility capabilities built into the GUI toolkit. My job would be much harder if that wasn't baked in to most UIs.

So, while I agree with you: let people have fun playing with technology and create new things without having to implement everything from the beginning, I don't think it is fair to just dismiss accessibility concerns like that.


A large percentage of users benefit from good accessibility features. These don't just include screen readers, but things like being able to change colors and font sizes (manually, or to pre-sets designed to help people with deteriorating vision or colorblindness, etc.)

Most people's vision and hearing go to hell at some point. Practically all older people can benefit from a11y features—whether they know they're there, and know how to enable them, is another matter.


Hell, I use display scaling to make everything large on my TV and everything tiny on my laptop.


Folks ask because everybody really really really wishes there was a GUI tool that made accessibility concerns easy. And so when they ask, they are asking: Is this finally "The One"??

And why do folks care about accessibility so much? Because it instantly makes everybody's life easier, and prevents you from needing to reinvent the wheel for a whole host of features.

I don't have any disabilities, but I love it when accessibility is done right anyway. Examples:

- I browse the web at 175%

- I prefer keyboard navigation whenever possible

- I like knowing what something is about to do before it does it, even if I'm not hovering my mouse (see "keyboard navigation", or "touch screen input")

- My connection is often unreliable, and I like knowing what images are supposed to be and, for that matter, being able to simply read text content (which some websites manage to break)

- tools with good accessibility, whether on the web or native, are almost always easier for me to write automations for, because they "follow the rules" and thus have reliable hooks for automation

Those are just a few of the reasons as a person who doesn't need accessibility tools, that I am always eager to know how well a GUI tool handles accessibility questions.


> - I browse the web at 175%

are you sure you don't have any disabilities? (j/k :)


We're asking because that is often a legal requirement for production use.


Because many of us make use of accessibility options, and so it's relevant to whether or not a project is of interest to us.


Well, it's a somewhat important and key feature for these kinds of projects. It's not much different than asking "does it support checkboxes" or "how does it handle scaling?"


Accessibility is probably a good way to discern whether a GUI is usable for a particular use case where one needs accessibility. It's an interesting feature to have for many. I don't ever read it as a way to disparage the GUI (GUIs, after all, are quite hard to implement - any experimentation, no matter how playful, is probably more than welcome in this space).


The point of accessibility is not minorities, but taking all users in various circumstances into account. Sometimes you may yourself be on a device where images do not show and you need alt text, or where you need to resize fonts to be able to read text.


Totally. I am dyslexic and a frequent user of my system's text-to-speech vocaliser, as well as a big proponent of immediate mode graphics. Every single time this topic comes up, people who often neither use accessibility features nor program immediate mode graphics make this point as a cheap put-down. Sure, this is a downside of the paradigm, just like hovering dropdown menus are hard to always do right in immediate mode layouts. In both cases there are ways this can be mitigated. This point has been made so many times that I have my cheap comeback ready: here is a blog post by the web agency of the W3C *giggle* explaining why they had to create their own CMS system because existing solutions don't have good support for accessibility *giggle*: https://w3c.studio24.net/updates/on-not-choosing-wordpress/


Looks pretty nice apart from the font rendering, which is terrible. It looks half decent with the default color scheme but when you switch to the dark mode it really shows.

I'm using a regular 1920x1080 27" display for reference

EDIT: the unhinted editor font looks good.


Oddly enough it's the inverse for me: I can hardly read the light theme one.


I was playing last week with this framework.

One of the best things egui does is update the layout using a reactive rendering strategy by default. So rather than doing a paint every frame (every ~16 ms), it does it lazily, only when the state changes.

For example: if you have a plot that updates every 10 ms, a UI paint will be triggered every 16 ms (up to 60 fps, I believe). If not, the UI will be updated lazily (when you move the mouse or another event happens).

Note: you can choose a "continuous" rendering mode for the backend as well.
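
In code the switch is basically whether you ask for another frame; a rough sketch of how that tends to look inside an eframe `update` (field names made up, API names per recent egui/eframe versions):

    use eframe::egui;

    struct PlotApp {
        samples: Vec<f64>,
        live_plotting: bool,
    }

    impl eframe::App for PlotApp {
        fn update(&mut self, ctx: &egui::Context, _frame: &mut eframe::Frame) {
            egui::CentralPanel::default().show(ctx, |ui| {
                ui.label(format!("{} samples", self.samples.len()));
            });

            if self.live_plotting {
                // Continuous-style updates: explicitly ask for another frame
                // (capped around 60 fps) even if no input event arrives.
                ctx.request_repaint();
            }
            // Otherwise egui stays reactive: it repaints only when the next
            // mouse/keyboard/resize event comes in.
        }
    }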

It also has great accessibility with a built-in screen reader. It's an impressive piece of work.


This is an old UI redraw strategy. It’s been used since the dawn of GUIs and is referred to as invalidation.

Here’s an example of a standard GUI invalidation event: https://github.com/Planimeter/grid-sdk/blob/master/engine/cl...

As you mentioned, most GUI software invalidates primarily on mouseovers, and individual components handle things like invalidation on focus.
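
As a generic sketch of the idea (not the grid-sdk code linked above - just the usual dirty-flag pattern, in Rust for consistency with the thread):

    struct Widget {
        needs_repaint: bool, // the "dirty" flag
        // ... rest of the widget state ...
    }

    impl Widget {
        // Called from event handlers: mouseover, focus change, data change.
        fn invalidate(&mut self) {
            self.needs_repaint = true;
        }

        // Called once per frame / paint cycle by the toolkit.
        fn paint_if_needed(&mut self) {
            if self.needs_repaint {
                self.draw();
                self.needs_repaint = false;
            }
        }

        fn draw(&self) {
            // Actual drawing commands would go here.
        }
    }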


I see many complaints about the font rendering, and I also agree the fonts don't look good. This is a general problem with OpenGL: OpenGL draws triangles; there is no built-in support for antialiased text rendering. And even if one figures out how to render live text with OpenGL, it won't be something a GPU is efficient at rendering. I think the normal approach is to pre-render all needed glyphs and then move them into position. Pathfinder, Slug, and the work from Raph Levien are the only projects I know of working on GPU-based font rendering.
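
A hedged sketch of that "pre-render the glyphs, then move them into position" approach - the types and names below are made up for illustration:

    use std::collections::HashMap;

    // UV rectangle of one glyph inside a texture atlas that was rasterized
    // up front (e.g. with FreeType on the CPU).
    struct Glyph {
        u0: f32, v0: f32, u1: f32, v1: f32,
        advance: f32,
    }

    struct FontAtlas {
        glyphs: HashMap<char, Glyph>,
    }

    // Lay out a string as textured quads. The GPU only draws triangles that
    // sample the atlas texture; it never rasterizes outlines itself.
    fn layout_text(atlas: &FontAtlas, text: &str, mut x: f32, y: f32) -> Vec<(f32, f32, f32, f32)> {
        let mut quads = Vec::new();
        for ch in text.chars() {
            if let Some(g) = atlas.glyphs.get(&ch) {
                // (x, y, uv-width, uv-height) - a real renderer would emit
                // vertices with both positions and UVs per corner.
                quads.push((x, y, g.u1 - g.u0, g.v1 - g.v0));
                x += g.advance;
            }
        }
        quads
    }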


Something interesting to think about:

- This has a framework "eframe" which lets you write generic apps which can be rendered and run either natively or on the web with WASM:

  - https://github.com/emilk/egui/tree/master/eframe
This has me thinking -- there's an existing standard "Web Audio Modules (WAM)" which never really went anywhere:

https://www.webaudiomodules.org/

I'm imagining some way of bridging the VST3 API's and the WAM API's, so that you could write code using "eframe" that would run either as a natively compiled VST plugin, or in a browser and using the DOM audio API's as a Web Audio Module.

               native -- glium (OpenGL) --                              --- VST3/LV2 API "Adapters"
              /                           \                            /
    eframe ---                             Generic Audio/Synth/FX API's
              \                           /                            \                                                   
               web ----- WASM (WebGL)-----                              --- Web Audio Modules API "Adapter"
Wouldn't something like that just be really friggin cool?

I'm not big-brain enough to figure out the details or if it's even viable, but damn it would be revolutionary.
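
To make the idea a bit more concrete, here's a hypothetical sketch of the shared layer - none of these trait or type names exist anywhere, they're just what a "generic audio/FX API" that both a native VST3/LV2 adapter and a WAM adapter could drive might look like:

    // Hypothetical plugin-side API; the VST3/LV2 and Web Audio Modules
    // adapters would live behind it and own the window/canvas.
    trait AudioFxPlugin {
        // Process one block of samples in place.
        fn process(&mut self, samples: &mut [f32], sample_rate: f32);

        // Draw the plugin UI with egui.
        fn ui(&mut self, ui: &mut egui::Ui);
    }

    struct Gain { db: f32 }

    impl AudioFxPlugin for Gain {
        fn process(&mut self, samples: &mut [f32], _sample_rate: f32) {
            let gain = 10f32.powf(self.db / 20.0);
            for s in samples.iter_mut() {
                *s *= gain;
            }
        }

        fn ui(&mut self, ui: &mut egui::Ui) {
            ui.add(egui::Slider::new(&mut self.db, -60.0..=12.0).text("gain (dB)"));
        }
    }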


Figuring this stuff out is mostly grunt work, see if you can get a minimally viable prototype going, then maybe you can attract some big brains to refactor and improve and make it into something.


I just couldn't resist posting this here from one of the files comment section:

iOS-style toggle switch:

    _____________
   /       /.....\
  |       |.......|
   \_______\_____/


Not sure if that would look better:

    _____________
   /       /.....\
  (       (.......)
   \_______\_____/


I'm using it for some of my little tools, and it's pretty seamless going between native and web, which I appreciate a lot. Unlike other non-native UI frameworks, navigating through egui feels pretty natural (the way inputs are handled, etc.).


I've always enjoyed making little apps with Immediate mode GUIs. It's a shame that they aren't as effective as retained GUIs for more complex apps.

When I was still programming in Rust, FLTK[1] was the only GUI library I used that both worked on my Raspberry Pi 4 and was something I felt I could grasp. I'm not sure if that is retained or immediate. I wish the OS maintainers would all get their act together and create an API that made creating cross-platform GUIs easier. You can't blame people for using Electron when you see the state of desktop GUIs.

1. https://github.com/fltk/fltk


FLTK is retained.

It’s, from my experience, the sanest of all toolkits - much easier and simpler to use than almost any other, for 98% of tasks, and usually faster. Back in 1998(!) when I started using it, it was also the only flicker free one for Windows and X (without very significant effort) although I believe that gap was finally closed circa 2008 or so.


Why aren't they good for more complex apps?


Layout becomes challenging after a certain point with ImGUI. Especially if your layout "flows" based on the state of existing elements. This can require multiple draw passes in ImGUI to get certain effects and outcomes.
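
A common mitigation is to retain last frame's measurements and lay out against those, accepting one frame of lag instead of multiple passes; a rough generic sketch (not any particular library's API):

    struct Panel {
        // Content height measured while drawing the previous frame.
        last_content_height: f32,
    }

    impl Panel {
        fn draw(&mut self, available_height: f32) {
            // Frame N: place the footer using frame N-1's measurement, so a
            // "flowing" layout converges after one extra frame instead of
            // needing multiple layout passes within a single frame.
            let footer_y = self.last_content_height.min(available_height);
            draw_footer_at(footer_y);

            // Re-measure this frame's content for use next frame.
            self.last_content_height = draw_content();
        }
    }

    fn draw_footer_at(_y: f32) { /* issue draw commands */ }
    fn draw_content() -> f32 { /* draw and return measured height */ 0.0 }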


The plot widget is pretty nice. Makes me wonder if it could be an interesting frontend for viewing data in Python, like a realtime-capable, GPU-driven, web-friendly matplotlib.


egui (えぐい) is Japanese for "harsh (taste, feeling, etc.); acrid; pungent; astringent "


It's also a slang term meaning "awesome; amazing; incredible; cool".

https://ejje.weblio.jp/content/えぐい


One thing I wish every immediate mode GUI library would show is how it handles complex scripts like Thai, Arabic, etc., because it's quite hard to get correct, but it's also a hard requirement if your application needs to support those languages.


I haven't looked at the usage pattern/implementation but the visuals are much nicer to the eyes than ImGui[1].

1: https://github.com/ocornut/imgui


I took a look at the GitHub repo and I understand this is a GUI library for Rust, but I don't quite understand what they mean by "immediate mode" - could someone elaborate on that, please? Thank you!


A section in the GitHub README tries to answer this question: https://github.com/emilk/egui#why-immediate-mode.

I also read an old 2005 blog post & video by Casey Muratori[1] that explains the differences in more detail.

[1] https://caseymuratori.com/blog_0001


Immediate mode GUIs, afaik, started in games. The GUI is drawn on every frame of the game; layout and state (i.e. button presses) are handled in code.


No, it started in the CS laboratory. The first GUIs were immediate; then this concept was quickly abandoned because of its numerous shortcomings. Where GUIs mattered, retained mode ruled supreme.


TIL, thanks.


This looks like a good excuse / starting point to learn Rust, but available documentation is basically tutorials that are hard to follow without knowing Rust. (I did work through Rustlings but mostly forgot it).

Of course, it's unfair to ask of any library to provide an intro that also presents the surrounding language. But Rusteros are always so eager to gain converts so... hey, here's a tip.


I've been looking for an immediate mode GUI library for LibGDX; I might have to write a port of this. I've struggled many times to get ImGui working nicely with LibGDX, but it's never worked for me.


The project depends on `libspeechd-dev`/`speech-dispatcher-devel`, but there's no mention of support for voice commands. Does the framework support speech recognition? That could be pretty neat.


You are confused because you haven't bothered to read the description of these libraries. They are for synthesis, not recognition.


Sounds like the screen reader dependencies


How about Wayland? Or does it need Xwayland?


Is this Rust-only, or are there bindings for C?


It takes advantage of Rust's closures for convenience, generics to avoid pointer indirections and heap allocations, and Drop for automatic memory management.

The API isn't complex, so it may be possible to export a C-compatible wrapper, but due to lack of the aforementioned language features it may be less optimized and less ergonomic.
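
To illustrate the ergonomics point: a C-compatible wrapper would have to trade the closure/builder API for opaque handles and function pointers. A purely hypothetical sketch - none of these symbols exist in egui:

    use std::os::raw::{c_char, c_void};

    // Opaque handle standing in for `&mut egui::Ui` on the C side.
    #[repr(C)]
    pub struct EguiUiHandle { _private: [u8; 0] }

    // Hypothetical exported button call: returns 1 if clicked.
    // In Rust you'd write `if ui.button("Quit").clicked() { ... }`;
    // over FFI the closure/builder ergonomics and monomorphized generics
    // are lost, and strings must be NUL-terminated.
    #[no_mangle]
    pub extern "C" fn egui_button(_ui: *mut EguiUiHandle, _label: *const c_char) -> i32 {
        // A real wrapper would recover the &mut Ui from the handle and call
        // into egui here; omitted in this sketch.
        0
    }

    // Hypothetical per-frame callback the C host would register instead of
    // passing Rust closures.
    pub type EguiFrameCallback =
        unsafe extern "C" fn(ui: *mut EguiUiHandle, user_data: *mut c_void);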


I have been enjoying it so far as I explore Rust. It's quick, and the WASM support is a big winner!


no error checking = browser with no webassembly fails silently with no error messages


Maybe I'm misunderstanding what you're trying to say, but I don't see how that's a criticism of the library. You can't really do WebAssembly feature detection from within a Rust library compiled to WebAssembly; that's the responsibility of the JS loader you use, not your UI library.


On iPhone Safari, I can’t change the Combobox value using the Combobox widget.


I've had this fight a dozen times and there's nothing really new for me to add, but I consider GUI frameworks that target WebGL and Canvas for their basic elements like text and sliders to be usually harmful, for multiple reasons[0].

That being said, some (imo high) praise:

- Demo is faster than most other WebGL/Canvas-backed frameworks that I've looked at in the past, including Flutter and Qt. It easily blows them out of the water.

- Demo works in Firefox

- Demo respects platform scroll direction(!!!)

- Demo supports platform-specific cut/paste(!!!)

- Demo responds to (some) browser zoom events(!!!)

- Demo is at least semi-responsive (again, compared to Qt/Flutter this is miles ahead)

- Demo handles touch events

- A ton of impressive attention to detail around UX affordances like showing link locations and tooltips on hover. Clearly focused on actual usability, not just on showing off tech.

- Just again stressing that interactions are really snappy and performant. Kind of starts to chug if I open a ton of windows, but... I'm comparing this to the last Flutter demo I saw on HN; the performance is great.

----

And some criticism:

- Evergreen accessibility concerns (I've seen some conversations about exposing a separate API, this... misses some of the point of how accessibility on the web is designed to work imo. Also, the DOM is an accessibility layer. I don't understand why we keep reinventing the wheel here, I don't understand why we're recreating content trees to send to screen readers when the browser has a content tree baked in)

- Search doesn't work at all

- No right click menus anywhere

- Font rendering appears to be heavily platform/DPI dependent (?), on some of my screens the whole app just looks blurry, I would abandon any app in the real world that had fonts that looked this bad on some of my screens.

- Copy/paste/text-selection disabled in most of the demo (for example, no way to copy links)

- Good on the demo for catching platform settings for scrolling, but it doesn't catch any of my other custom keyboard shortcuts or gestures. Even some stuff that's not platform-specific: I'm not sure why page down/up isn't getting caught.

- Demo blocks some mouse gestures that should filter up to the browser level (pinch to zoom, etc)

- Demo has no fallback if WebGL isn't enabled, which makes this solution unworkable for anyone configuring their browser to avoid fingerprinting (hopefully Firefox will make this concern go away in the future by bundling WebGL permissions into Canvas permissions. This is not really the author's problem to solve, I don't think).

- Evergreen criticism that demo does not respect my font choices or work with any of my browser extensions. This is the important thing, because I still don't see how (short of re-implementing the entire DOM) a GUI framework like this is ever going to respect my browser settings and extensions. You can polish a GUI framework for eternity, that doesn't mean 3rd-party extensions are going to start working with it. This is a huge step backwards for the web in my opinion, I regularly customize and extend the sites I visit. There's no way for a developer to allow that using a framework like this.

----

TLDR I think this is easily the best Canvas-backed GUI demo I've seen so far, the author(s) should be very proud of it. I can't speak to its native performance but I'm seriously impressed by how the demo performs in the browser. Also seriously impressed by some of the little UI touches I'm seeing. I can't stress this enough, good job.

However, this is still not the right way to build a cross-platform GUI that targets the web. The demo still has serious flaws that aren't the result of a lack of polish, they're the result of the rendering target itself. I still heavily advocate that people trying to build GUIs for the web use something else (even for the stated use case of a simple GUI). There are multiple cross-platform GUIs that will spit out actual DOM content. A polished cross-platform GUI that spits out HTML instead of pixels as its render target won't have the problems that I listed above.

I understand that working with HTML is frustrating, particularly in some native circles where the way developers think about UIs is primarily visually. But the DOM is not a framework for displaying visual graphics, the DOM is a render target. A lot of benefits on the web stem from the fact that we treat the DOM as its own render target. That philosophy is why a lot of extensions and behaviors on the web are possible. No rule is absolute, there are instances where WebGL is the right choice. But for most frameworks, particularly frameworks that are displaying text data, it's not.

This focus on taking the DOM out of the picture to get rid of the messy details of how it works is a bit like arguing that the best way to improve command line interfaces is to abandon text and start streaming OpenGL buffers to the terminal output. It's missing the point. Yes, working with the DOM is frustrating, but a lot of the frustrating parts are a result of the medium itself. The DOM is part of the user interface.

----

[0]: https://news.ycombinator.com/item?id=27132087


> No right click menus anywhere

Just wanted to add that I think that's going to be added soon; I see there's a pull request[0] to get those in. Seems like development on this is pretty active.

----

[0]: https://github.com/emilk/egui/pull/543


Amazing review. Thanks for posting.

Since you are experienced in GUI tech, what cross-platform solution do you think "has got it right"? I like Qt apps a lot for snappiness alone.

What do you think of https://quasar.dev/ and https://ionicframework.com/ ?


Right to start, going to be clear that I would not consider myself to be particularly experienced in GUI tech. I have a checklist of things that I look for whenever a "run native code in the web" project gets posted, but I haven't downloaded and checked the native performance of any of these technologies. I'm really just looking at their web demos.

That being said, on the web, Blazor is what I'm paying the most attention to right now. Being in C# might be a pro or a con for you. But Blazor will spit out an actual DOM tree, it seems to have a lot of promise from what I can see. There's a heavy emphasis on writing UI in C#, but having good interop with browser technologies and JS libraries.

On the Rust side of things, pay attention to https://www.areweguiyet.com/, they've been doing a great job of cataloging different efforts. A lot of the Rust efforts have either gone in the direction of "embed a Webrender instance in a native app" or "embed a native app in the browser." I'm not sure that's a great direction, at some point I think it might make more sense to treat the HTML as a compile target rather than an authoring language.

Again, I point towards Blazor. Not the C# part obviously, keep it in Rust. But I wish some of the Rust GUIs would take Blazor's approach and spit out separate HTML for the web (even if the results look slightly different) rather than trying to embed the same renderer everywhere.

I don't personally use Qt, but that is purely a personal preference; if you're doing native cross-platform apps, I don't think there's anything wrong with Qt. It's popular for a good reason. However, I do think that Qt's web export is really bad. It's making the same mistakes as egui, but egui clearly has the benefit of a large amount of work devoted to performance, UX, etc... not to say that Qt's web export hasn't had a lot of work put into it, but egui's seems to have actually paid off, and the Qt demos don't give me that impression.

That's purely on the web though, as a user for native cross platform code I don't really have many complaints about Qt, if you like it you should keep using it.

Ionic and Quasar are obviously going to have great results on the web because they're based on React and Vue, which are web technologies -- so of course they're going to render to DOM in those situations. I'm not sure how their native performance will be. My immediate concern with both of them is that they're trying to upsell me on a license. You might also take a look at React Native if those libraries seem attractive to you. I don't know what the consensus is on whether React Native runs well on mobile devices.

And again going to stress here, I'm opinionated but I'm not an expert. I'm not really commenting on the ease of use or the architecture paradigms or language choices or anything here, I'm just looking at the web demos these technologies have online and checking to see if they run OK inside a browser.


Thanks for the reply.


What an amazingly detailed write up, thanks for sharing. For web libraries or frameworks, what are your preferred options and why?


> For web libraries or frameworks

Sort of a broad question, I think that depends a lot on what you're trying to build, what programming styles are most natural to you, what dependency requirements you care about, etc...

I'm not sure I can pick out one or two JS frameworks for web interfaces and say they'll be the clear right choice for everyone.



Very slow to compile, on top of being over engineered with dependencies

Also something seems very weird with the way they render their fonts, it's very blurry (could be the font+size that is the issue)

nuklear or imgui are much better at doing immediate mode GUIs without dependencies


Needs copy and paste?


I get why people like immediate mode GUIs but the performance tends to be pretty bad.


Bad is relative. If it's a game where you have to repaint the entire screen anyway, then it doesn't matter. But a lot of potential optimizations are lost when you don't use a retained scenegraph.


Citation needed? I have a not too complex but definitely non-trivial editor I made for a little toy game engine and the Dear ImGUI code and related UI logic runs in under 1/10th of a millisecond per frame. My past Qt applications performed worse than that. My single page web applications often perform worse than that

But like the other reply says, bad is relative.


> Dear ImGUI code and related UI logic runs in under 1/10th of a millisecond per frame. My past Qt applications performed worse than that.

Are you using FreeType font rendering with Dear ImGui? Because otherwise it's not a fair comparison - just getting "proper" font rendering (and not ImGui's default, "okay-for-an-internal-app" font rendering) is a very large computational cost. Last time I had to profile that for my Qt app, proper font rendering alone, with a few thousand characters shown, would easily take more than 1/10th of a millisecond on a decent computer, without counting anything else.


Immediate mode GUIs are not designed for production.

As soon as you want to do compositing, you can’t, because compositing is done with retained structures.

Immediate mode GUIs already lie to you by retaining some data then flushing it later.


It's not a real immediate mode GUI unless it starts drawing pixels instantly? That's not even how real immediate mode graphics APIs worked.


I don't know what your qualm is. I just said that.


Well why do you think it's important (important enough to call immediate mode a lie, even)? Immediate vs. deferred is in my view entirely about the API design, it does not really matter whether an immediate-mode GUI framework is appending to a drawlist when you call into a widget's code or is writing some pixels into some buffer; the point is that it does not have retained state about individual widgets (ImGUI doesn't, don't know about egui).

As you say, compositing on a widget level is much, much harder, but even traditional retained frameworks generally only support compositing on a top-level window basis (e.g. partially translucent windows, translucent menus or drop-downs) because as we are seeing (and you point out in another post), compositing between essentially arbitrary stuff is heinously complex to implement, let alone implement with decent performance.

I'd be interested to hear what use cases you have in mind for intrawindow compositing.


Yeah, I mean, I get what you're saying about the API design. That's probably more important to people because it's fun or novel, but it basically does have retained state about individual widgets. Not literally every individual widget, but it needs to store something, or it wouldn't know what you hovered over, or were dragging and dropping, so it does this with IDs. That's state, and it's retained. That is, it's not flushed after every frame. There's a global state used to track this, but it's just hidden from you. The IDs are probably generated based on call order, since the design requires you to "check" by drawing.
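
The classic form of that hidden state is the "hot/active item" pattern from the imgui literature; a minimal sketch (with IDs passed in explicitly rather than derived from call order):

    #[derive(Default)]
    struct UiState {
        hot: Option<u64>,    // widget the cursor is over this frame
        active: Option<u64>, // widget currently holding the mouse button
        mouse_pos: (f32, f32),
        mouse_down: bool,
    }

    // Immediate-mode button: drawn and hit-tested every call, but the
    // hot/active IDs persist in `UiState` between frames. (The framework
    // would reset `hot` at the start of each frame.)
    fn button(state: &mut UiState, id: u64, rect: (f32, f32, f32, f32)) -> bool {
        let (x, y, w, h) = rect;
        let (mx, my) = state.mouse_pos;
        let hovered = mx >= x && mx <= x + w && my >= y && my <= y + h;

        if hovered {
            state.hot = Some(id);
            if state.mouse_down && state.active.is_none() {
                state.active = Some(id);
            }
        }

        // "Clicked" = the mouse was pressed on this widget earlier and has
        // now been released while still over it.
        let clicked = !state.mouse_down && state.active == Some(id) && hovered;
        if !state.mouse_down && state.active == Some(id) {
            state.active = None;
        }
        // Drawing (picking colors based on hot/active) would happen here.
        clicked
    }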

Like you said though, even immediate mode rendering in reality worked a bit differently than people really understood. Today we're not actually performing a DMA transfer with every single call.

You're generally right about how compositors do effects layering, since you have to continually reference existing texture data to composite layers without limits, though as far as I know, web browsers do this a bit differently, and I'm not sure what the limits are with the major implementations, or if they exist.


I mean, I’m looking at this on mobile and desktop and just like a another poster mentioned, in relative terms, the damn thing is smoother than most web apps.

What’s the end goal here, right? The web sucks at the moment and I’m open to all ideas.


Smoother yes, but to be fair how much CPU and power is it using? That's invisible to you right now so appearances can be deceptive.


As soon as you want to do anything remotely as complex as the web does, like say arbitrary UI element sizes with scrollable overflow, you immediately have to do the same thing CSS 2.1 rasterizer engines do.


I think you are just dead wrong about this. I’ve played video games my whole life and they always always had more complicated UIs, akin to web apps. It’s always been a richer experience.


I am the author of a CSS 2.1 rasterizer, as well as the author of GUI software for game engines. They have different ways they draw user-interfaces.

CSS 2.1 and the associated modules and extensions to it that you would use today to build UIs are much more complicated to implement than game engine UI.

If the opposite were true, you'd see more web browsers that weren't KHTML derivatives, but that's not the case.


> I am the author of a CSS 2.1 rasterizer

Yikes, checkmate.

I guess, the frustration expressed has more to do with applications shifting to the browser. However we are rendering now is just not acceptable for web apps. My gut said ‘fuck it, do what games do’.

We need something different, and I’d be happy to hear your thoughts.


I agree with you. I think the real issue is that web developers don't have any ability to control the compositor besides hinting it to move portions of documents into explicit backing layers. You do this today with CSS transform operations, because those operations are done in shaders today.

That being said, I don't know if web developers should have that type of access, either.

The slowness you experience is based primarily on not redrawing something once, but several times, as modern web browsers need to break down very large web pages into tiles, then repeat draw instructions per tile where intersections exist. If I recall correctly, WebKit actually has an option to let you see the tile boundaries and when layers invalidate. This is my best understanding of the architecture KHTML derivatives use today. Compositing is very slow. And layout algorithms are very, very slow.

So you can either do some fun stuff that I think imgui does, like sending VBOs over which is very performant, because you can draw lots of elements but you can't at all do things that people care about like compositing (think blurs behind panels, or layer opacity, etc.), or you can composite layers, but this becomes very expensive and slow to do.

If you simply said, hey redraw this portion of a web page into a single framebuffer, you'd cut down on draw time a little bit, I'm sure, but because you can't allocate framebuffers reliably at arbitrary sizes, or beyond the viewport size, you run into GUI software that has weird limitations like all of a sudden you notice that for some reason you can't resize windows beyond the bounds of your screen, etc.

I think Valve actually did this somehow, but I'm not sure how they did it with their old VGUI code.

Anyway, all of that being said, today's modern GUI architectures allow for two choices: fast and lot of elements with no compositing, or slow and really good looking GUIs.

It turns out most people want the latter, but complain about it, because reasonably(!) it's not fast, and hardware keeps getting faster, but GUIs are one of the prime examples of us eating up all of that performance.


Which same thing do you mean? I read your very interesting longer comment below, and you mentioned a lot of differences, and it's not clear which of them you mean arbitrary UI element sizes with scrollable overflow requires you to do. Do you mean that it requires you to use very slow layout algorithms? Maybe the problem there is that CSS wasn't designed to be efficiently implementable; I feel like CSS doesn't usually give me better layouts than LaTeX, which runs orders of magnitude faster than any CSS engine.

Is there a specific reason that scrollable overflow (rather than visible or hidden, which TeX can do) inherently makes the problem much harder than what TeX does? Naively I'd think making the overflow scrollable wouldn't affect the computational complexity at all, but clearly I know a lot less about the space than you do. Are there other things in CSS that are very useful for UI layout but very difficult to implement with submillisecond timing guarantees?

As I understand it, for example, iOS UI autolayout uses Cassowary, which uses the simplex algorithm and I think may at a better point in the expressiveness/performance tradeoff space than CSS. (Hilariously in this context, the original Cassowary paper https://constraints.cs.washington.edu/solvers/cassowary-toch... has an overfull hbox.) But Cassowary is from 02002, and there have been a lot of advancements in constraint solving and optimization algorithms in the last 19 years, both in theory and in practice:

• Z3 wasn't released until 02012;

• SAT/SMT solvers in general have gotten astoundingly better (the SAT Competition started in 02002);

• there have been enormous advances in conic optimization (overview in https://arxiv.org/abs/1709.08841), including open-source libraries like CVXOPT that are widely used (in the context of conic optimization, anyway);

• for more general optimization, automatic differentiation has gone mainstream; and

• there have been huge advances in gradient-descent variants including AdaGrad (02011), RMSprop (02014?), and Adam (02014).

In general, gradient-descent solvers and the interior-point methods often used for conic optimization are anytime algorithms, so you can trade off responsiveness against stability/correctness. If your layout algorithm hasn't converged by the frame deadline, you can just draw the best layout it's come up with so far; it's guaranteed to have all the right stuff in it, just maybe overlapping or in the wrong place, probably only by a few pixels, and displaying something is often better than displaying nothing. The conic optimization algorithms can also guarantee that the layout they deliver is "feasible" in the sense of respecting all the hard constraints in the constraint set; it might just be suboptimal. (Gradient-descent algorithms don't really have hard constraints; they have to simulate them by adding heavy penalties to the loss function.)

I don't know of anybody actually applying this stuff to UI layout; do you?


I would say the most labor intensive aspect is that indefinite element boundaries require you to pretty much have backing layers that are tiled, or if you're OK with the trade off, because sometimes it is OK, to limit elements with scrollable panels to the viewport size, since that's generally what happens with framebuffers or rendertarget graphics APIs, I believe. You also don't always get to receive texture sizes based on your request, so you have to waste some space using power of 2 sizes sometimes.

Yeah, layout algorithms add a lot of overhead and even the fastest ones today still add significant overhead that you have to account for versus manually placing elements by coordinates, scaled or not.

I think TeX is faster generally speaking since it is concerned with fewer dimensions of layout. This is really what you want for a fast layout engine, but not what designers want for really rich UIs.

It's hard for me to compare this without intuitively knowing it to be true, because I would need to measure the actual amount of layout CPU cycles, roughly, to get a sense of how efficient one over the other is.

I'm not really sure about whether CSS inherently produces slower layouts. I wouldn't assume it would, but rather that specific portions of the box model are difficult or not possible to parallelize, and as a result become an obvious bottleneck.

Scrollable overflows aren't inherently slow, although if you have internal target spaces that dynamically change in size very frequently, you would, as one would expect, be thrashing any allocated single framebuffers for that visible space if you took that strategy over tiled rendering.

The best things about layouts in CSS is that CSS 2.1 is so old, everyone has been exposed to it a little bit, even if they can't formally implement it themselves. Developers generally have a little bit of a sense of what display properties do.

That same oldness of CSS 2.1 unfortunately means that the algorithm for box layout hasn't been revisited by many people. If you took every person in the world who had actually implemented a significant portion of the box model rasterizer, we wouldn't easily fill a room.

It's basically descriptive, and left as an exercise to the author to implement. People gave Microsoft a really hard time over this, understandably, but if you gave it a go yourself, you'd understand why.

I'm unfortunately not intimately knowledgeable with Cassowary from an implementation point of view, but I have observed its design and would be very interested to see it placed side by side with a reference implementation of the box model layout, and flexbox in particular, but no one has created such an agnostic rasterizer platform to my knowledge other than browser authors themselves.

I'd love to see more constraint solvers in the wild! We really need some variety out here. I believe some people in the WHATWG have been working on a standard to expose layout events to JavaScript, so maybe we'll have something a little bit more native than Cassowary implemented with JavaScript and `position: absolute` (which is totally an OK override to implement your own layout algorithm!).

So while I am not aware of anyone applying this today, I do actively see the foundations making it possible.

Anyway, basically game UIs use VBOs and no compositing, or viewport dimension constrained framebuffers and a lot of textures. Both of those strategies are nothing like the web, because the web uses tiled backing layers since you never know what a web developer is going to do. Are they going to create a 10000 pixel high rasterized document fragment? No idea. But if we don't tile it, it happens in software. That's slow, so no one does that anymore.

The foundational difference in game UIs is usually in whether or not you're pushing a bunch of vertices and color data to the GPU alongside usually Freetype rasterized type, or a bunch of textures and color data alongside Freetype rasterized type. Yes, I know textures are accompanied by vertex data, too, but the general idea is that you're either reusing that texture data to do cool things like blurs, or you're not, and that's the gist of compositing, which is slow, but pretty.

With browsers you make a bunch of usually 256 or 512(?) px wide and tall tiles, where geometry intersects you run some graphics commands that help draw that portion of the UI, and that's how your compositor works instead. I'm not sure what the industry has measured the ideal tile size to be these days, but I know that's on the mind of engineers, since you want to minimize redraw as much as possible.


I'm so ignorant about modern graphics programming that I don't know what backing layers are or why they should be tiled. Googling doesn't help. Maybe they're offscreen pixel buffers you draw into? I guess VBOs are vertex buffer objects? How can a web developer create a rasterized document fragment at all, much less one ten thousand pixels high? Do you mean, like, creating a huge offscreen <canvas> and drawing stuff on it? (But in that case how would CSS layout be involved at all?)

I think TeX typically takes in the ballpark of ten million machine instructions per page of text (say, 256-1024 words) to do the layout and produce the .dvi output. To guarantee reproducibility, it doesn't use floating-point. (It also doesn't use multithreading because when it was written in the 01980s multiprocessor computers were an exotic rarity, and threading libraries even rarer.) As a test, I downloaded the source files for https://arxiv.org/abs/1709.08841 and rendered them with valgrind --tool=cachegrind latex final_acc.tex, which took 2,440,969,752 instructions (in 1.1 seconds when run without valgrind), about half of them in startup stuff (loading TikZ, hyperref, amsmath, and so on) and the other half to render the 18 pages themselves, about 12000 words of text.

That's about 70 million instructions per page or a hundred thousand instructions per word, including all the parsing and bibliographic database querying but none of the rasterizing --- TeX thinks of letterforms as just boxes and leaves the pixels to dvipdf or xdvi or whatever. (dvi2fax takes 1.7 seconds to convert the .dvi to G3 fax images.) I think this is an unusually slow document. I have the impression that this is about one to two orders of magnitude faster than doing CSS layout on common web pages, but usually the TeX results look better. But maybe I'm wrong about that?

You say you think TeX is concerned with "fewer dimensions of layout". What kind of dimensions are you thinking about? Like, Z-order? My naive view of the pipeline is that you have some kind of "document" or "form" which specifies the things you want to display and how they relate to each other; then a layout algorithm (normally incremental) processes this document into a set of boxes, each of which has dimensions (x, y, w, h, layernum), like TeX in my example above; and then a rasterizer (like dvi2fax) uses these boxes to paint things like character glyphs and icons onto one or more layers (pixel buffers), at least the visible part of them, that then get alpha-composited together, maybe transformed using things like blurs and rotations and reflections and stuff. Am I suffering from some kind of basic misconception? Because that naive vision makes it really puzzling to think that blurring a background could affect the performance of the layout algorithm.

I mean, rendering the blurred pixels themselves takes time, of course, both because computing the filter takes time and because you're rendering two layer pixels for every output pixel. But it doesn't affect the layout. Does it?

I have the impression that, in my naive pipeline above, it's not the rasterizer or compositor that is the bottleneck in web pages, but the layout algorithm. I mean, I remember when scrolling text over a nonscrolling background image in a browser was slow, but that was a long time ago, and alpha-compositing together a screenful of two layers hardly seems like it should stretch the capabilities of even modern CPUs, much less modern GPUs; but, again, maybe I'm suffering from some basic misconceptions about what is involved. Isn't it the same problem as rendering a translucent HUD in a game UI while the 3-D world keeps getting rendered behind it? You don't have to actually iterate over pixels (or, you know, packed 128-bit vectors of pixels) for any images or letterforms that are outside the viewport, do you?

I definitely do not mean to deprecate the difficulty of implementing CSS, much less implementing it efficiently; as you said elsewhere, if it were easy, we'd be seeing new CSS engines that weren't just forks of KHTML. But I suspect that CSS is hard to implement efficiently in large part because CSS is badly designed, and that a better design could simultaneously admit more efficient implementations and be more expressive. Cassowary might be an actual existing example of this.


There are some misconceptions about this. The first issue is that people assume you have to render the whole GUI every frame even if nothing has changed. This is the usual pattern in games, where you have to redraw the screen anyway, but you do not have to do that in applications where it doesn't make sense.



