What is H2O in Machine Learning?
H2O.ai is a leader in the 2018 Gartner Magic Quadrant for Data Science and Machine Learning Platforms.
H2O is open-source software for big-data analysis. It is produced by the company H2O.ai. H2O allows users to fit thousands of potential models as part of discovering patterns in data.
The H2O software can be called from the statistical package R, Python, and other environments. It is used to explore and analyze datasets held in cloud computing systems and in the Apache Hadoop Distributed File System, and it runs on the conventional operating systems Linux, macOS, and Microsoft Windows.
The H2O software is written in Java, Python, and R. Its graphical user interface is compatible with four browsers: Chrome, Safari, Firefox, and Internet Explorer.
H2O is Java-based software for data modeling and general computing. It serves many purposes, but its primary one is to act as a distributed (many machines), parallel (many CPUs), in-memory (several hundred GB of Java heap, set with -Xmx) processing engine.
There are two levels of parallelism:
- within node
- across (or between) nodes
The goal of H2O is to allow simple horizontal scaling for a given problem so that a solution is produced faster. The MapReduce paradigm, together with a good concurrent application structure, enables this type of scaling in H2O.
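To make the two levels of parallelism concrete, here is a minimal Python sketch of the MapReduce idea applied to computing a mean. It is not H2O code and does not use the H2O API. The function names (map_chunk, combine, node_mean_parts, cluster_mean) are hypothetical, the "nodes" are plain Python lists standing in for machines, and worker threads stand in for CPU cores.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: turn one chunk of rows into a partial (sum, count) pair."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Reduce step: merge two partial results. The operation is associative,
    so partials can be merged in any grouping."""
    return (a[0] + b[0], a[1] + b[1])


def node_mean_parts(rows, workers=4):
    """Within-node level: split one node's rows across worker threads
    (threads here only illustrate the structure, not real CPU speedup)."""
    size = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(combine, partials, (0, 0))


def cluster_mean(nodes):
    """Across-node level: each inner list stands in for one machine's data.
    Only the small (sum, count) pairs would travel between machines."""
    partials = [node_mean_parts(rows) for rows in nodes]
    total, count = reduce(combine, partials, (0, 0))
    return total / count


if __name__ == "__main__":
    nodes = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    print(cluster_mean(nodes))
```

Because the reduce step only merges small partial results, adding more nodes mainly adds more independent map work, which is what makes horizontal scaling straightforward in this style of computation.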
Video for H2O
https://www.youtube.com/watch?v=9W_c2Ec23PM