http://sortbenchmark.org/
It doesn't just cover Hadoop, but maybe the methodology will give you an idea of what you're looking for.
There are too many variables to pin down a "general" average. Every job will run differently on every cluster: the machines can be heterogeneous builds with heterogeneous configs at the machine level, the cluster has its own configs that may or may not override the machine-level ones, and on top of that the job submitter can specify runtime variables (see the sketch below)...
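As a rough illustration of that last point, a job driver can set per-job overrides in its Configuration and also accept -D key=value flags at submission time via ToolRunner. This is just a minimal sketch; the class name, mapper/reducer choices, and the particular values are illustrative, not a recommendation:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

/** Minimal word-count driver showing how job-level settings override cluster defaults. */
public class WordCountDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // Per-job overrides: these win over mapred-site.xml / cluster-level defaults.
        // Values here are purely illustrative.
        conf.setInt("mapreduce.job.reduces", 8);
        conf.setInt("mapreduce.task.io.sort.mb", 256); // bigger sort buffer -> fewer spills

        Job job = Job.getInstance(conf, "wordcount-benchmark");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner also honors -D key=value overrides passed on the command line.
        System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
    }
}
```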
Things like the type of data being processed affect how much disk I/O and network traffic a job needs, and those costs in turn depend on the hardware behind them (disks, NICs, memory)...
Throwing more nodes at a problem will usually make it faster, but how much faster depends on where the bottleneck actually is (CPU, disk, network, shuffle) and how well the data partitions...
The best way to read your cluster is to establish a benchmark operation that models your expected use case (or one of them), then adjust things on the cluster and see what tips the time, spill, network traffic, etc. one way or the other.
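For the "time, spill, network traffic" part, the built-in job counters usually tell you enough to compare runs. A minimal sketch (the helper name is mine, not a Hadoop API) that times a run and pulls the relevant counters:

```java
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

/** Run the same benchmark job before and after a config change and compare the numbers. */
public class BenchmarkRun {
    public static void runAndReport(Job job) throws Exception {
        long start = System.currentTimeMillis();
        boolean ok = job.waitForCompletion(true); // blocks until the job finishes
        long wallMillis = System.currentTimeMillis() - start;

        Counters c = job.getCounters();
        long spilled  = c.findCounter(TaskCounter.SPILLED_RECORDS).getValue();
        long mapOut   = c.findCounter(TaskCounter.MAP_OUTPUT_BYTES).getValue();
        long shuffled = c.findCounter(TaskCounter.REDUCE_SHUFFLE_BYTES).getValue();

        System.out.printf(
            "success=%b wall=%dms spilledRecords=%d mapOutputBytes=%d shuffleBytes=%d%n",
            ok, wallMillis, spilled, mapOut, shuffled);
    }
}
```

If spilled records jump after a change, you're short on sort buffer; if shuffle bytes dominate, the network (or a missing combiner) is where to look.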