11/16/2013: Using Pig with Accumulo (building on Jason Trost's work)
https://github.com/medined/accumulo-pig shows how to use Accumulo as a simple source for Apache Pig. (By the way, I haven't tested writing to Accumulo yet.)
After the quick setup, you'll be able to read from Accumulo using a script like the following. The '\' character represents a line continuation. Be careful with security: the password is sent in plain text and is also stored in your history buffer. This should probably be changed to read the credentials from a property file.
pig
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-core.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-fate.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/accumulo-trace.jar
register /home/vagrant/accumulo_home/bin/accumulo/lib/libthrift.jar
register /home/vagrant/accumulo_home/bin/zookeeper/zookeeper-3.4.5.jar
register /vagrant/accumulo-pig/target/accumulo-pig-1.4.0.jar
DATA = LOAD 'accumulo://people?instance=instance&user=root&password=secret\
&zookeepers=affy-master:2181&columns=attribute' \
using org.apache.accumulo.pig.AccumuloStorage() AS (row, cf, cq, cv, ts, val);
HEIGHTS = FOREACH DATA GENERATE row, cq, val;
SORTED_SET = ORDER HEIGHTS BY val DESC;
dump SORTED_SET;
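One way to act on the property-file idea mentioned above is to build the LOAD location string from a file instead of typing the password into the script. The Python sketch below is only an illustration, not something accumulo-pig provides: the property keys, the sample contents, and the helper names `parse_properties` and `build_load_location` are all hypothetical, and no escaping of special characters is attempted.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def build_load_location(table, props, columns):
    """Build the accumulo:// location string used in the Pig LOAD statement."""
    return (
        "accumulo://{0}?instance={1}&user={2}&password={3}"
        "&zookeepers={4}&columns={5}"
    ).format(
        table,
        props["instance"],
        props["user"],
        props["password"],
        props["zookeepers"],
        columns,
    )


# Hypothetical property file contents. In practice this text would be read
# from a file that only your own account can read.
example = """
# connection settings (illustrative values taken from the script above)
instance=instance
user=root
password=secret
zookeepers=affy-master:2181
"""

location = build_load_location("people", parse_properties(example), "attribute")
print(location)
```

The printed string is the same location used in the LOAD statement above, joined onto one line. How it gets handed to Pig (for example through Pig's parameter substitution) is left open here.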
11/12/2013: Using Accumulo Proxy From Python
Start the Proxy Server
- Start an Accumulo cluster using https://github.com/medined/Accumulo_1_5_0_By_Vagrant
- vagrant ssh master
- cd /home/vagrant/accumulo_home/bin/accumulo/proxy
- edit proxy.properties so that instance=instance and zookeepers=affy-master:2181
- accumulo proxy -p proxy.properties
- cd /home/vagrant/software
- Download the thrift gz from http://www.apache.org/dyn/closer.cgi?path=/thrift/0.9.1/thrift-0.9.1.tar.gz
- sudo apt-get install -y libboost-dev libboost-test-dev libboost-program-options-dev libevent-dev automake libtool flex bison pkg-config g++ libssl-dev
- sudo apt-get install -y ruby-full ruby-dev librspec-ruby rake rubygems libdaemons-ruby libgemplugin-ruby mongrel
- sudo apt-get install -y python-dev python-twisted
- sudo apt-get install -y libbit-vector-perl
- tar xvfz thrift-0.9.1.tar.gz
- cd thrift-0.9.1
- ./configure
- make
- sudo make install
- thrift -version
- cd lib/py
- sudo python setup.py install
- cd /home/vagrant/software
- thrift --gen py $ACCUMULO_HOME/proxy/thrift/proxy.thrift
- cd /home/vagrant/accumulo_home/software/accumulo
- export PYTHONPATH=/home/vagrant/software/gen-py
- python proxy/examples/python/TestClient.py
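The proxy.properties edit in the steps above can also be scripted rather than done by hand. This is a minimal Python sketch, assuming the file uses plain key=value lines. The helper name `set_properties` and the sample file contents are hypothetical; only the two values being set (instance=instance and zookeepers=affy-master:2181) come from the steps above.

```python
def set_properties(text, updates):
    """Return properties text with the given keys set to new values.

    Existing key=value lines are rewritten in place, missing keys are
    appended at the end, and comments and blank lines are kept as they are.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        key = stripped.partition("=")[0].strip()
        if stripped and not stripped.startswith("#") and key in remaining:
            out.append("{0}={1}".format(key, remaining.pop(key)))
        else:
            out.append(line)
    # Append any keys that were not already present in the file.
    for key in sorted(remaining):
        out.append("{0}={1}".format(key, remaining[key]))
    return "\n".join(out) + "\n"


# Illustrative file contents, not the real proxy.properties.
sample = (
    "# sample settings\n"
    "instance=test\n"
    "zookeepers=localhost:2181\n"
    "someOtherSetting=unchanged\n"
)

updated = set_properties(
    sample,
    {"instance": "instance", "zookeepers": "affy-master:2181"},
)
print(updated)
```

To use it on the real file, read proxy.properties into a string, pass it through `set_properties`, and write the result back before running `accumulo proxy -p proxy.properties`.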