1) New Relic, Datadog, Pingdom, and many other hosted monitoring services are available. In addition, you should think about running your own health check responder(s).
2) It often depends on why your sites went down, and on what sort of infrastructure your application runs on. Some server setups can manage this when a single server process (such as a unicorn worker) goes down or hangs, but if the whole thing goes belly up, you may be facing something considerably more catastrophic.
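
To illustrate the health check responder mentioned in point 1, here is a minimal sketch using only the Python standard library. The `/health` path, the default port 8080, and the `check_dependencies` helper are all assumptions made for the example, not anything a particular monitoring service requires. A real responder would replace the placeholder check with whatever matters for your application (database, cache, queue, disk space).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_dependencies():
    """Hypothetical placeholder: swap in real checks (database ping, cache, etc.)."""
    return {"app": "ok"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the (assumed) /health path is served; anything else is a 404.
        if self.path != "/health":
            self.send_error(404)
            return
        checks = check_dependencies()
        healthy = all(value == "ok" for value in checks.values())
        body = json.dumps(
            {"status": "ok" if healthy else "degraded", "checks": checks}
        ).encode("utf-8")
        # 200 when every check passes, 503 otherwise, so an external
        # monitor can alert on the status code alone.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep per-request logging quiet


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), HealthHandler)
```

Running `make_server().serve_forever()` starts the responder, and an external monitor can then be pointed at the `/health` URL. Keep in mind that a responder living inside the same process or machine as the application only covers the "single process hangs" case from point 2. If the whole machine goes down, the responder goes with it, which is why an outside poller is still needed.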