Great questions.
1. Yes, running tests in parallel helps. We also cache actions so subsequent runs are much faster (this is disabled in the demo).
2. I agree that testing can be much more reliable and pleasant in some codebases than others. I have not been blessed with those codebases in my career. My view of flakiness comes from personal experience automating UI tests specifically and watching them break when a new nondeterministic popup modal is added or another engineer breaks an identifier/locator strategy.
That being said, if you like writing UI tests and your codebase supports easily maintaining them, there are some really cool DSLs like Maestro!
> We also cache actions so subsequent runs are much faster
Interesting, what do you cache? How do you know when one change needs a rerun versus another?
> Flakiness is from personal experience automating UI tests specifically and having them break when a new nondeterministic popup modal is added or another engineer breaks an identifier/locator strategy
A modal popping up isn't a flake, though. A flake is more often a button that is on screen while the test runner can't find it due to run-loop issues or emulator/simulator issues. If a modal pops up during a test, how does CamelQA resolve this, and how would it know whether it's an actual regression? A modal popping up at the wrong time _could_ be a real regression, or it could be a developer forgetting to configure some local state.
1. The AI agent writes an automation script (similar to an Appium script) that we can replay after the first successful run. If there are issues, the AI agent gets pulled back into the loop.
2. You can define acceptance criteria in natural language with camel.
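The replay-with-fallback loop described in point 1 can be sketched roughly like this. To be clear, this is my own minimal illustration, not CamelQA's actual implementation: the step format, `execute`, and `agent_repair` are all invented stand-ins, and the "repair" heuristic is a toy (a real agent would inspect the UI tree or a screenshot to re-derive the action).

```python
class StepFailed(Exception):
    """Raised when a cached step's locator no longer matches the screen."""
    pass

def execute(step, screen):
    # Simulated action replay: fails if the locator isn't on screen.
    if step["locator"] not in screen:
        raise StepFailed(step["locator"])

def agent_repair(step, screen):
    # Stand-in for the AI agent re-deriving a working action.
    # Toy heuristic: find an on-screen locator sharing the first token.
    intent = step["locator"].split("_")[0]
    return {**step, "locator": next(s for s in screen if intent in s)}

def replay(steps, screen):
    # Fast path: replay every cached step deterministically.
    # Slow path: on failure, pull the agent back in and update the cache.
    cache = []
    for step in steps:
        try:
            execute(step, screen)
        except StepFailed:
            step = agent_repair(step, screen)
            execute(step, screen)
        cache.append(step)
    return cache  # the next run replays the repaired steps directly

# Example: an engineer renamed "submit_old" to "submit_v2" between runs.
steps = [{"action": "tap", "locator": "login_btn"},
         {"action": "tap", "locator": "submit_old"}]
screen = ["login_btn", "submit_v2"]
print(replay(steps, screen))
```

The interesting design question (which the thread is circling) is how the agent decides whether a failed step is a broken locator to repair or a genuine regression to report; that judgment call is exactly what the toy heuristic above glosses over.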