Perform powerful statistical analysis by embedding
the R language in PostgreSQL.
I took two introductory statistics
classes in graduate school and found
that I really liked the subject. It wasn’t
always intuitive, but it always was
interesting, and it really helped me
to put research, polling and many
newspaper stories in a new light. I don’t
do statistical analysis every day, but
it’s a powerful tool for organizing and
working with large sets of data, and for
finding correlations among seemingly
disparate pieces of information.
For the courses I took, my university
told me to buy SPSS, a commercial
program that helps you perform
statistical calculations. Being an open-source kind of guy, I discovered R—a
programming language aimed at
helping people solve problems involving
calculations and statistics. R is a full-fledged language, and it theoretically can
be used in a wide variety of situations.
But it was designed primarily for
mathematical and statistical work, and
that’s where it really shines.
I managed to get through the class
just fine using R instead of SPSS. The
quality of R’s documentation and the
intuitive feel of the language, especially
for someone experienced with Ruby and
Python, meant that when my instructors
demonstrated how to do something in
SPSS, I managed to find the appropriate
parallel in R, and even get a result before
they had finished their explanation.
32 / SEPTEMBER 2012 / WWW.LINUXJOURNAL.COM