1. The author has earlier blog posts about this activation function, and apparently it does help with learning binary logic tasks.
2. I doubt this matters at all here. For some architectures it helps to have inputs that average to 0, so the author probably just picked it as the default choice.