First, I use FileSystem to open a file in HDFS:
FSDataInputStream m_dis = fs.open(...);
Second, I read the data from m_dis into a byte array:
byte[] inputdata = new byte[m_dis.available()]; //m_dis.available = 47185920
m_dis.read(inputdata, 0, 20 * 1024 * 768 * 3);
The value returned by m_dis.read() is 131072 (2^17), so the data after offset 131072 is missing. It seems that FSDataInputStream uses a short to manage its data, which confuses me a lot. The same code ran fine on Hadoop 1.2.1.
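For context on what I've tried to understand: the general java.io.InputStream contract says a single read(buf, off, len) call may return fewer bytes than requested, so the usual pattern is to loop until the buffer is full (or use DataInputStream.readFully, which FSDataInputStream inherits). A minimal sketch of such a loop, using a plain ByteArrayInputStream since a full HDFS setup isn't reproducible here:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {
    // Accumulate reads until the buffer is full or EOF is reached;
    // a single read() call is allowed to return fewer bytes than asked for.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break; // EOF before the buffer was filled
            }
            off += n;
        }
        return off; // total number of bytes actually read
    }

    public static void main(String[] args) throws IOException {
        byte[] src = new byte[300_000]; // larger than the 131072 I observed
        int total = readFully(new ByteArrayInputStream(src), new byte[src.length]);
        System.out.println(total);
    }
}
```

With an FSDataInputStream the same loop (or m_dis.readFully(inputdata)) would keep reading past the first 131072 bytes, but I still don't understand why Hadoop 1.2.1 returned everything in one call while the newer version does not.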