{include:Labs Production}
{excerpt}A recipe for executing Weka in Hadoop.{excerpt}

h2. Project Info

* *Status:* Compatible with Weka >= 3.7.10
* *Roadmap:* Future work - distributed clustering, distributed recommendation engine, pre-processing for text mining, oversampling for minority classes
* *Availability:* Open Source - [Subversion|https://svn.cms.waikato.ac.nz/svn/weka/trunk/packages/internal/distributedWekaHadoop/] \- [Install via Weka's package manager|http://weka.wikispaces.com/How+do+I+use+the+package+manager%3F]
* *Contact:* mhall or use "Add Comment" at bottom of page
* *JIRA:* [http://jira.pentaho.com/browse/DATAMINING-608|http://jira.pentaho.com/browse/DATAMINING-608] [http://jira.pentaho.com/browse/DATAMINING-609|http://jira.pentaho.com/browse/DATAMINING-609]

This package for Weka >= 3.7.10 provides several jobs for executing learning tasks inside Hadoop. These include:

# Determining ARFF metadata and summary statistics
# Computing a correlation or covariance matrix
# Training a Weka classifier or regressor
# Generating randomly shuffled (and stratified) input data chunks
# Evaluating a Weka classifier or regressor via cross-validation or a hold-out set
# Scoring using a trained classifier or regressor

A full-featured command line interface is available, along with GUI Knowledge Flow components for job orchestration. Predictive models learned in Hadoop are fully compatible with Pentaho Data Integration's "Weka Scoring" transformation step.

!TwoJobs.png|align=center,border=1!
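To give a feel for why a task such as computing summary statistics suits Hadoop, here is a minimal Python sketch of the general map-reduce pattern: each data chunk is summarised independently, and the partial summaries are then merged. This is an illustration only and not the package's actual implementation. The names {{PartialStats}}, {{map_chunk}} and {{reduce_partials}} are hypothetical, and the sketch handles a single numeric attribute.

```python
import math
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Mergeable summary of one numeric attribute (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def add(self, value: float) -> None:
        # Update the running aggregates with a single value.
        self.count += 1
        self.total += value
        self.total_sq += value * value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "PartialStats") -> "PartialStats":
        # Combining two partial summaries only needs their aggregates,
        # never the underlying rows, so chunks can be processed separately.
        return PartialStats(
            self.count + other.count,
            self.total + other.total,
            self.total_sq + other.total_sq,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def std_dev(self) -> float:
        # Sample standard deviation derived from the sums.
        if self.count < 2:
            return 0.0
        variance = (self.total_sq - self.total ** 2 / self.count) / (self.count - 1)
        return math.sqrt(max(variance, 0.0))


def map_chunk(values):
    """'Map' phase: summarise one chunk of the input on its own."""
    stats = PartialStats()
    for value in values:
        stats.add(value)
    return stats


def reduce_partials(partials):
    """'Reduce' phase: merge the per-chunk summaries into one."""
    merged = PartialStats()
    for partial in partials:
        merged = merged.merge(partial)
    return merged


# Three chunks standing in for input splits (made-up values).
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
summary = reduce_partials(map_chunk(chunk) for chunk in chunks)
print(summary.count, summary.mean, summary.minimum, summary.maximum)
```

Because the merge step depends only on counts, sums, and extremes, the result does not depend on how the rows are split into chunks. That property is what makes this kind of job easy to distribute.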
More information on what is available in the distributed Weka package, and how it is implemented, can be found in a three-part blog post:

* [Weka and Hadoop Part 1|http://markahall.blogspot.co.nz/2013/10/weka-and-hadoop-part-1.html]
* [Weka and Hadoop Part 2|http://markahall.blogspot.co.nz/2013/10/weka-and-hadoop-part-2.html]
* [Weka and Hadoop Part 3|http://markahall.blogspot.co.nz/2013/10/weka-and-hadoop-part-3.html]

h2. Try it out\!

Open Weka's package manager (GUIChooser-->Tools-->Package manager) and install "distributedWekaHadoop".

{scrollbar}