2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018

11/16/2013: Using Pig with Accumulo (building on Jason Trost's work)

https://github.com/medined/accumulo-pig shows how to use Accumulo as a simple source for Apache Pig. (By the way, I haven't tested writing to Accumulo yet) After the quick setup, you'll be able to read from Accumulo using a script like the following. The '\' character represents a line continuation. Be careful with your security because the password is sent via plain text. And stored in your history buffer. Probably this should be changed to use a property file.

pig
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-core.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-fate.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-trace.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/libthrift.jar
register /home/vagrant/accumulo_home/bin/zookeeper/zookeeper-3.4.5.jar
register /vagrant/accumulo-pig/target/accumulo-pig-1.4.0.jar

DATA = LOAD 'accumulo://people?instance=instance&user=root&password=secret\
&zookeepers=affy-master:2181&columns=attribute' \
using org.apache.accumulo.pig.AccumuloStorage() AS (row, cf, cq, cv, ts, val);

HEIGHTS = FOREACH DATA GENERATE row, cq, val;
SORTED_SET = ORDER HEIGHTS BY val DESC;
dump SORTED_SET;


subscribe via RSS