3 Tutorials on Using R with Hadoop
Jeffrey Breen of Atmosphere Research Group presented at tk on how to use Apache Hadoop with the statistical programming language R by way of RHadoop. Hadoop has become practically synonymous with big data, and R has become a language of choice for many data scientists, so it’s natural to want to use the two together.
Breen has made his presentations available on SlideShare, and the code and configuration files are available on GitHub.
The first tutorial explains how to install Hadoop on a local virtual machine to help you get familiar with Hadoop.
The second guides you through the process of setting up R and RStudio on an Amazon Web Services EC2 instance.
The final presentation demonstrates how to launch a Hadoop cluster on EC2 using Apache Whirr.
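For readers new to the MapReduce model that Hadoop runs, and that RHadoop makes reachable from R, the three steps (map, shuffle, reduce) can be sketched in a few lines of plain Python. This is only an in-memory illustration under stated assumptions: it is not code from Breen's tutorials, it does not use Hadoop or RHadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read from distributed storage.
    print(word_count(["R with Hadoop", "Hadoop with R"]))
```

On a real cluster, the map and reduce functions run in parallel across machines and the framework performs the shuffle; the sketch only shows how the pieces fit together.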