Data mining and machine learning software

  • KNIME
    KNIME (pronounced /naɪm/), the Konstanz Information Miner, is an open source data analytics, reporting and integration platform.
  • Rattle GUI
    Rattle GUI is a free and open source software (GNU GPL v2) package providing a graphical user interface (GUI) for data mining using the R statistical programming language.
  • Vowpal Wabbit
    Vowpal Wabbit (also known as "VW") is an open source fast out-of-core learning system library and program developed originally at Yahoo! Research, and currently at Microsoft Research.
  • Waffles (machine learning)
    Waffles is a collection of command-line tools for performing machine learning operations developed at Brigham Young University.
  • Wolfram Language
    The Wolfram Language, a general multi-paradigm programming language developed by Wolfram Research, is the programming language of Mathematica and the Wolfram Programming Cloud.
  • H2O (software)
    H2O is open-source software for big-data analysis.
  • Programming with Big Data in R
    Programming with Big Data in R (pbdR) is a series of R packages and an environment for statistical computing with Big Data by using high-performance statistical computation.
  • Dlib
    Dlib is a general purpose cross-platform software library written in the programming language C++.
  • LIBSVM
    LIBSVM and LIBLINEAR are two popular open source machine learning libraries, both developed at the National Taiwan University and both written in C++ though with a C API.
  • MLPACK (C++ library)
    mlpack is a machine learning software library for C++, built on top of the Armadillo library.
  • Torch (machine learning)
    Torch is an open source machine learning library, a scientific computing framework, and a script language based on the Lua programming language.
  • NetOwl
    NetOwl is a suite of multilingual text and entity analytics products that analyze Big Data in the form of text data – reports, web, social media, etc.
  • OpenNN
    OpenNN (Open Neural Networks Library) is a software library written in the C++ programming language which implements neural networks, a main area of deep learning research.
  • TensorFlow
    TensorFlow is an open source software library for machine learning in various kinds of perceptual and language understanding tasks.
  • Folding@home
    Folding@home (FAH or F@h) is a distributed computing project for disease research that simulates protein folding, computational drug design, and other types of molecular dynamics.
  • Scikit-learn
    Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language.
  • Deeplearning4j
    Deeplearning4j is a deep learning programming library written for Java and the Java virtual machine (JVM) and a computing framework with wide support for deep learning algorithms.
  • Piranha (software)
    Piranha is a text mining system developed for the United States Department of Energy (DOE) by Oak Ridge National Laboratory (ORNL).
  • SolveIT Software
    SolveIT Software Pty Ltd is a provider of advanced planning and scheduling enterprise software for supply and demand optimisation and predictive modelling.
  • DADiSP
    DADiSP (Data Analysis and Display, pronounced day-disp) is a numerical computing environment developed by DSP Development Corporation which allows one to display and manipulate data series, matrices and images with an interface similar to a spreadsheet.
  • Ilastik
    ilastik is user-friendly, free, open source software for image classification and segmentation.
  • CNTK
    Computational Network Toolkit, or CNTK, is a deep learning framework developed by Microsoft Research.
  • Encog
    Encog is a machine learning framework available for Java and .NET.
  • Mallet (software project)
    MALLET is a Java "MAchine Learning for LanguagE Toolkit".
  • Feature Selection Toolbox
    Feature Selection Toolbox (FST) is software primarily for feature selection in the machine learning domain, written in C++, developed at the Institute of Information Theory and Automation (UTIA), of the Czech Academy of Sciences.
  • GraphLab
    Turi (formerly GraphLab) is a graph-based, high-performance, distributed computation framework written in C++.
  • Gremlin (programming language)
    Gremlin is a graph traversal language and virtual machine developed by Apache TinkerPop of the Apache Software Foundation.
  • KXEN Inc.
    KXEN was an American software company founded in June 1998 by Roger Haddad and Michel Bera, based on an original idea from Léon Bottou.
  • SequenceL
    SequenceL is a general purpose functional programming language and auto-parallelizing tool set, whose primary design objectives are performance on multi-core processor hardware, ease of programming, platform portability/optimization, and code clarity and readability.
  • Jubatus
    Jubatus is an open source online machine learning and distributed computing framework that is developed at Nippon Telegraph and Telephone and Preferred Infrastructure.
  • Neural Designer
    Neural Designer is a software tool for data mining based on machine learning techniques, a main area of artificial intelligence research.
  • Xgboost
    Xgboost is an open-source software library which provides a gradient boosting framework for C++, Java, Python, R, and Julia.
  • Weka (machine learning)
    Waikato Environment for Knowledge Analysis (Weka) is a popular suite of machine learning software written in Java, developed at the University of Waikato, New Zealand.
  • FICO
    FICO (NYSE: FICO), originally Fair, Isaac and Company, is a data analytics company based in San Jose, California focused on credit rating services.
  • Apache Flume
    Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
  • Distributed R
    Distributed R is an open source, high-performance platform for the R language.
  • Fluentd
    Fluentd is a cross platform open source data collection solution originally developed at Treasure Data.
  • Shogun (toolbox)
    Shogun is a free, open source toolbox written in C++.
  • Comparison of deep learning software
    A comparison of some of the most popular software frameworks, libraries and computer programs for deep learning.
  • Aphelion (software)
    The Aphelion Imaging Software Suite is a software suite that includes three base products (i.e., Aphelion Lab, Aphelion Dev, and Aphelion SDK) for addressing image processing and image analysis applications.
  • Julia (programming language)
    Julia is a high-level dynamic programming language designed to address the requirements of high-performance numerical and scientific computing while also being effective for general-purpose programming, web use or as a specification language.
  • Mlpy
    Mlpy is an open source Python machine learning library built on top of NumPy/SciPy and the GNU Scientific Library; it makes extensive use of the Cython language.
  • Angoss
    Angoss Software Corporation, headquartered in Toronto, Ontario, Canada, with offices in the United States and UK, is a provider of predictive analytics systems through software licensing and services.
  • Oracle Data Mining
    Oracle Data Mining (ODM) is an option of Oracle Corporation's Relational Database Management System (RDBMS) Enterprise Edition (EE).
  • UIMA
    UIMA (pronounced "u e ma") stands for Unstructured Information Management Architecture.
  • SAS (software)
    SAS (Statistical Analysis System) is a software suite developed by SAS Institute for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.
  • ND4S
    ND4S is a free, open-source extension of the Scala programming language operating on the Java Virtual Machine—though it is compatible with both Java and Clojure.
  • Tanagra (machine learning)
    Tanagra is a free suite of machine learning software for research and academic purposes, developed by Ricco Rakotomalala at the Lumière University Lyon 2, France.
  • Pipeline Pilot
    Pipeline Pilot is the authoring tool for the Accelrys Enterprise Platform.
  • Wolfram Mathematica
    Wolfram Mathematica (sometimes referred to as Mathematica) is a symbolic mathematical computation program, sometimes called a computer algebra program, used in many scientific, engineering, mathematical, and computing fields.
  • R (programming language)
    R is a programming language and software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.
  • GNU Octave
    GNU Octave is software featuring a high-level programming language, primarily intended for numerical computations.
  • Apache SystemML
    Apache SystemML (incubating) is a flexible machine learning system that automatically scales to Spark and Hadoop clusters.
  • MeeMix
    MeeMix Ltd is a company specializing in personalizing media-related content recommendations, discovery and advertising for the telecommunication industry, founded in 2006.
  • RapidMiner
    RapidMiner is a software platform developed by the company of the same name that provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics.
  • General Architecture for Text Engineering
    General Architecture for Text Engineering or GATE is a Java suite of tools originally developed at the University of Sheffield beginning in 1995 and now used worldwide by a wide community of scientists, companies, teachers and students for many natural language processing tasks, including information extraction in many languages.
  • Massive Online Analysis
    MOA (Massive Online Analysis) is free open-source software for data stream mining with concept drift.
  • Apache Giraph
    Apache Giraph is an Apache project to perform graph processing on big data.
  • MATLAB
    MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and fourth-generation programming language.
  • ND4J (software)
    ND4J is a scientific computing library, written in the programming language C++, operating on the Java virtual machine (JVM), and compatible with the languages Java, Scala, and Clojure.
  • Orange (software)
    Orange is a free software machine learning and data mining package (written in Python).
  • CellCognition
    CellCognition is a free open-source computational framework for quantitative analysis of high-throughput fluorescence microscopy (time-lapse) images in the field of bioimage informatics and systems microscopy.
  • SPSS Modeler
    IBM SPSS Modeler is a data mining and text analytics software application from IBM.
  • Apache Mahout
    Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily in the areas of collaborative filtering, clustering and classification.
  • Deep Web Technologies
    Deep Web Technologies is a software company that specializes in mining the Deep Web — the part of the Internet that is not directly searchable through ordinary web search engines.
  • ELKI
    ELKI (for Environment for DeveLoping KDD-Applications Supported by Index-Structures) is a knowledge discovery in databases (KDD, "data mining") software framework developed for use in research and teaching by the database systems research unit of Professor Hans-Peter Kriegel at the Ludwig Maximilian University of Munich, Germany.