04/22/2014: How to Generate a PGP Key on Headless Unix (also called How to Increase Your Entropy)
It took me far too long to find this series of steps. I ran into a problem generating a gpg key because my system did not have enough entropy.
You can check your entropy using
cat /proc/sys/kernel/random/entropy_avail
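If you would rather script that check than eyeball it, here is a minimal sketch. The function name entropy_at_least and the threshold of 1000 are my own assumptions for illustration; pick whatever cutoff suits your system.

```shell
# Hypothetical helper: succeeds when the kernel's entropy estimate is at
# least the given threshold. The file path is an optional second argument
# so the function can be exercised against a test file.
entropy_at_least() {
    threshold=$1
    source_file=${2:-/proc/sys/kernel/random/entropy_avail}
    available=$(cat "$source_file") || return 2
    [ "$available" -ge "$threshold" ]
}

# Example use (1000 is an assumed cutoff, not a documented requirement).
if entropy_at_least 1000; then
    echo "entropy looks sufficient"
else
    echo "entropy is low; consider feeding the pool before generating a key"
fi
```

The function returns 0 when there is enough entropy, 1 when there is not, and 2 when the file cannot be read.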
Run through this procedure at least once manually before trusting it in a batch scenario.
First, let's generate a configuration file for the gpg key. Note that no passphrase is specified. If that's a problem, you can add a 'Passphrase:' line. Also note that the umask is tightened before the file is created, so the configuration file is readable only by its owner, and is relaxed again afterwards.
umask 0277
cat << EOF > /tmp/$USER-gpg-genkey.conf
%echo Generating a package signing key
Key-Type: DSA
Key-Length: 1024
Subkey-Type: ELG-E
Subkey-Length: 2048
Name-Real: `hostname --fqdn`
Name-Email: $USER@`hostname --fqdn`
Expire-Date: 0
%commit
%echo Done
EOF
umask 0002
Now we start a background task that hashes every file on the root filesystem. The resulting disk activity increases your entropy. We store the task's PID for later use.
(find / -xdev -type f -exec sha256sum {} >/dev/null \; 2>&1) &
export ENTROPY=$!
Next generate the key. Make sure you look at the log files at least once to see if any errors were generated. When done, kill the entropy-generating process and delete the configuration file.
gpg --batch --gen-key /tmp/$USER-gpg-genkey.conf > gpg-keygen.log 2> gpg-keygen_error.log
ps -ef | grep find | awk '{ print $2 }' | grep ${ENTROPY} && kill ${ENTROPY}
rm /tmp/$USER-gpg-genkey.conf
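The ps/grep/awk pipeline is one way to check that the background job is still alive before killing it. A possible alternative, sketched here with sleep standing in for the find job (that stand-in is my assumption, purely so the snippet runs quickly), is to ask the shell directly with kill -0:

```shell
# Stand-in for the entropy-generating find job (assumption: any
# long-running background command behaves the same way here).
sleep 60 &
ENTROPY=$!

# ... the gpg --batch --gen-key step would run here ...

# kill -0 sends no signal; it only reports whether the PID is still alive.
if kill -0 "$ENTROPY" 2>/dev/null; then
    kill "$ENTROPY"
fi

# Reap the job so it does not linger; its exit status is not interesting.
wait "$ENTROPY" 2>/dev/null || true
```

This avoids matching an unrelated process whose PID or command line happens to contain the same digits.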
I wish I still had the URL where I found this tidbit. Sorry.
04/17/2014: Using RMarkdown in an Analysis Archive and Retrieval System
RMarkdown is good stuff.
Hopefully, you've already heard of RMarkdown. If not, you'll understand it pretty quickly by looking at this simple example:
http://rpubs.com/medined/replacing_part_of_time_series_using_time_based_selection
Essentially, you mix R code with Markdown markup to create a 'living' document. The R code is executed when the document is rendered, and its output is embedded in the resulting page. The full power of R (and all of its extensions) can be used. There are many examples of this online.
RMarkdown pages can be computer-generated. Imagine if any given analytic documented its intermediate steps, from Data Load to Final Visualization, using RMarkdown. I bet user confidence in the final product would increase. It would also be trivial to duplicate the document and tweak it (draft mode) before republishing. Since RMarkdown is text-based, you could provide human-readable diff reports between analyses. Another advantage of this text-based system would be full-text search across all analyses.
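To make the diff and search ideas concrete, here is a rough sketch using nothing but diff and grep. The archive layout, file names, and file contents are all hypothetical; a real archive would hold full RMarkdown documents.

```shell
# Hypothetical archive: one .Rmd file per published version of an analysis.
archive=$(mktemp -d)
printf '%s\n' '# Sales analysis' 'Load data from sales_2013.csv' > "$archive/analysis_v1.Rmd"
printf '%s\n' '# Sales analysis' 'Load data from sales_2014.csv' > "$archive/analysis_v2.Rmd"

# Human-readable diff report between two versions.
# diff exits with status 1 when the files differ, so that is not an error.
diff_report=$(diff -u "$archive/analysis_v1.Rmd" "$archive/analysis_v2.Rmd" || true)
printf '%s\n' "$diff_report"

# Full-text search: list every analysis that mentions a term.
matches=$(grep -l 'sales_2014' "$archive"/*.Rmd)
printf '%s\n' "$matches"
```

Because the documents are plain text, the same approach would work with any version control system or search indexer.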
I could also point out the value in being able to produce an analytic report without needing to know Java, Python, or another programming language. Just knowing the math is complex enough.
Update: http://studio.sketchpad.cc/sp/pad/view/ro.9QNw0rsxwki4J/rev.480 - The archive timeline widget allows visitors to view all versions of the source document.
Update: You can do this same kind of thing with Python code. Check out http://ipython.org/notebook.html.
I'm not saying R is the answer to all problems. But this idea of archivable, diffable analytic solutions was interesting.