<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cognizant Transmutation &#187; Macintosh</title>
	<atom:link href="http://blog.ibd.com/tag/macintosh/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.ibd.com</link>
	<description>Internet Bandwidth Development: Composting the Internet for over Two Decades</description>
	<lastBuildDate>Fri, 18 Jun 2010 02:00:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Bonjour / AVAHI &amp; Netatalk to share files between Ubuntu 10.04 &amp; Mac OS X</title>
		<link>http://blog.ibd.com/sysadmin/bonjour-avahi-netatalk-to-share-files-files-between-ubuntu-10-4-mac-os-x/</link>
		<comments>http://blog.ibd.com/sysadmin/bonjour-avahi-netatalk-to-share-files-files-between-ubuntu-10-4-mac-os-x/#comments</comments>
		<pubDate>Sun, 09 May 2010 05:57:08 +0000</pubDate>
		<dc:creator>Robert J Berger</dc:creator>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[Mac OS X]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://blog.ibd.com/?p=574</guid>
		<description><![CDATA[<p>It used to be somewhat difficult to have filesystems on an Ubuntu system show up in the Mac Finder the same way that other Mac filesystems do. There has long been an open source Unix implementation of the Apple Filing Protocol (AFP), but for a long time the Ubuntu packages were not properly configured to [...]]]></description>
			<content:encoded><![CDATA[<p>It used to be somewhat difficult to have filesystems on an Ubuntu system show up in the Mac Finder the same way that other Mac filesystems do. There has long been an open source Unix implementation of the Apple Filing Protocol (AFP), but for a long time the Ubuntu packages were not properly configured to work transparently with modern (Snow Leopard) Mac OS X.</p>
<p>One blog post, <a href="http://www.kremalicious.com/2008/06/ubuntu-as-mac-file-server-and-time-machine-volume/" target="_blank">HowTo: Make Ubuntu A Perfect Mac File Server And Time Machine Volume</a>, did a great job going through all the steps needed to build Netatalk from source and configure it to work transparently with past Ubuntu releases. But with the Ubuntu 10.04 Lucid release, the Netatalk in the Ubuntu repository is already built and configured to support transparent Apple Filing Protocol based file sharing.</p>
<p>There are still a few configuration steps, mainly for AVAHI (the Unix implementation of the Bonjour service discovery protocol), that need to be done so you can see your Ubuntu filesystems in your Mac&#8217;s Finder like other Macintosh instances. We&#8217;ll also see how to make the Ubuntu instance show up as an ssh server.</p>
<h2>Installing Packages</h2>
<p>You will need to install the following packages on your Ubuntu 10.04 instance. This assumes that you already did a clean install of Ubuntu 10.04 and used the update manager to bring it up to date. If you have already installed some of these, it should not be a problem.</p>
<h3>Install ssh server</h3>
<p>I can&#8217;t believe that Ubuntu doesn&#8217;t install an ssh server by default, but in any case it&#8217;s pretty easy to add. This is not needed to use netatalk, but I wanted both ssh and netatalk to work and be discoverable via Bonjour.</p>
<pre><code>sudo apt-get install openssh-server</code></pre>
<p>Then you&#8217;ll need to set up your authorized keys on the ubuntu server. In your home directory do the following:</p>
<pre><code>mkdir -p .ssh
# Copy your public key[s] to .ssh/authorized_keys (not shown here)
# Set the permissions to only allow your user to access the .ssh directory and files in there
chmod -R og-rwx .ssh</code></pre>
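<p>The copy step that is not shown above could look something like the following. This is only a sketch: it assumes the public key file has already been transferred to the Ubuntu server, and the <code>add_key</code> function name is made up for illustration.</p>
<pre><code># add_key PUBKEY_FILE [AUTHORIZED_KEYS_FILE]
# Append a public key to authorized_keys unless that exact line is already there.
# (add_key is a hypothetical helper, not part of OpenSSH.)
add_key() {
  pub="$1"
  auth="${2:-$HOME/.ssh/authorized_keys}"
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
  if ! grep -q -x -F -f "$pub" "$auth"; then
    # tee -a appends to the file (and echoes the key to the terminal)
    cat "$pub" | tee -a "$auth"
  fi
  # sshd ignores key files that other users can write to
  chmod 700 "$(dirname "$auth")"
  chmod 600 "$auth"
}</code></pre>
<p>For example, <code>add_key ~/id_rsa.pub</code>, where <em>~/id_rsa.pub</em> stands for wherever you put the copied key. Running it twice with the same key leaves a single entry.</p>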
<h3>Install Netatalk</h3>
<pre><code>sudo apt-get install netatalk</code></pre>
<h4>Configure Netatalk</h4>
<p>You don&#8217;t need to change any of the configuration files for netatalk. The defaults will enable the sharing of your home directory. If you want to share any additional filesystems from your Ubuntu instance to your Macs, you can add them to <em>/etc/netatalk/AppleVolumes.default</em>. That file has explanations of all the options.</p>
<p>You may want to change the default last item in /etc/netatalk/AppleVolumes.default from:</p>
<pre>~/			"Home Directory"</pre>
<p>to something like:</p>
<pre>~/ "$h_$u Home Directory" options:upriv,usedots</pre>
<p>This will change the name that shows up in listings to be &#8220;<em>hostname_username Home Directory</em>&#8221; and will use Unix privileges. Most importantly, the usedots option says not to do hex translation of dot files. If you don&#8217;t do this, you&#8217;ll see things like<br />
<code>:2e_somefilename</code> instead of <code>.somefilename</code> where filenames start with &#8220;dot&#8221;.</p>
<h3>Configure AVAHI</h3>
<p>AVAHI is probably already installed if you did a standard installation.</p>
<p>Copy the avahi ssh service configuration into <em>/etc/avahi/services</em></p>
<pre><code>sudo cp /usr/share/doc/avahi-daemon/examples/ssh.service /etc/avahi/services/</code></pre>
<p>Create an avahi afpd service configuration by creating a file <em>/etc/avahi/services/afpd.service</em> with the following content:</p>
<pre><code>&lt;?xml version="1.0" standalone='no'?&gt;&lt;!--*-nxml-*--&gt;
&lt;!DOCTYPE service-group SYSTEM "avahi-service.dtd"&gt;
&lt;service-group&gt;
  &lt;name replace-wildcards="yes"&gt;%h&lt;/name&gt;
  &lt;service&gt;
    &lt;type&gt;_afpovertcp._tcp&lt;/type&gt;
    &lt;port&gt;548&lt;/port&gt;
  &lt;/service&gt;
  &lt;service&gt;
    &lt;type&gt;_device-info._tcp&lt;/type&gt;
    &lt;port&gt;0&lt;/port&gt;
    &lt;txt-record&gt;model=Xserve&lt;/txt-record&gt;
  &lt;/service&gt;
&lt;/service-group&gt;
</code></pre>
<p>You should now be able to see the Ubuntu host in your Finder under the SHARED section on the left side of the Finder. You should also see your Ubuntu host in the &#8220;New Remote Connection&#8221; window of the Mac Terminal app (CMD-SHIFT-K) if you select the &#8220;Secure Shell (ssh)&#8221; Service.</p>
<p>If you don&#8217;t see the Ubuntu hostname in the Finder or in the Terminal New Remote Connection window, restart the avahi-daemon service:</p>
<pre><code>sudo restart avahi-daemon</code></pre>
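<p>You can also check from the Ubuntu side whether the services are being announced. The following is a sketch that assumes the <code>avahi-browse</code> tool is available (on Ubuntu it comes from the avahi-utils package); the <code>browse_service</code> wrapper name is made up for illustration.</p>
<pre><code># browse_service TYPE: list the announced Bonjour services of one type.
# (browse_service is a hypothetical wrapper around the real avahi-browse tool.)
browse_service() {
  if command -v avahi-browse; then
    # -r resolves host name and port, -t exits once the current list is printed
    avahi-browse -r -t "$1"
  else
    echo "avahi-browse not found; try: sudo apt-get install avahi-utils"
    return 1
  fi
}</code></pre>
<p>Running <code>browse_service _afpovertcp._tcp</code> and <code>browse_service _ssh._tcp</code> should list your Ubuntu host if the two service files above were picked up.</p>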
<h2>TimeMachine Support</h2>
<p>The new Ubuntu Netatalk package is also supposed to support Time Machine storage. You can enable this in <em>/etc/netatalk/AppleVolumes.default</em> by adding <em>tm</em> as an option to the filesystems published in this file. I have not tried this, and many sources consider it a risky way to store Time Machine backups.</p>
<h2>Troubleshooting</h2>
<p>You should make sure that there is at least one afpd process running on the Ubuntu instance. You can see the log info in <em>/var/log/daemon.log</em>.</p>
<p>That&#8217;s it!</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://blog.ibd.com/sysadmin/bonjour-avahi-netatalk-to-share-files-files-between-ubuntu-10-4-mac-os-x/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>HBase/Hadoop on Mac OS X (Pseudo-Distributed)</title>
		<link>http://blog.ibd.com/scalable-deployment/hbase-hadoop-on-mac-ox-x/</link>
		<comments>http://blog.ibd.com/scalable-deployment/hbase-hadoop-on-mac-ox-x/#comments</comments>
		<pubDate>Mon, 03 May 2010 03:50:13 +0000</pubDate>
		<dc:creator>Robert J Berger</dc:creator>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[Mac OS X]]></category>

		<guid isPermaLink="false">http://blog.ibd.com/?p=565</guid>
		<description><![CDATA[<p>I wanted to do some experimenting with various tools for doing Hadoop and HBase activities and didn&#8217;t want to have to bother making it work with our Cluster in the Cloud. I just wanted a simple experimental environment on my Macbook Pro running Snow Leopard Mac OS X.</p>
<p>So I thought it was time to revisit installing [...]]]></description>
			<content:encoded><![CDATA[<p>I wanted to do some experimenting with various tools for doing Hadoop and HBase activities and didn&#8217;t want to have to bother making it work with our Cluster in the Cloud. I just wanted a simple experimental environment on my Macbook Pro running Snow Leopard Mac OS X.</p>
<p>So I thought it was time to revisit installing Hadoop and HBase on the Mac using the latest versions of everything. This will be deployed in Pseudo-Distributed mode, native to Mac OS X. Some folks actually create a set of Linux VMs with a full Hadoop/HBase stack and run that on the Mac, but that is a bit of overkill for now.</p>
<p>These instructions mainly follow the standard instructions for <a href="http://hadoop.apache.org/common/docs/current/quickstart.html" target="_blank">Apache Hadoop</a> and <a href="http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#pseudo-distrib" target="_blank">Apache HBase</a></p>
<h2>Prerequisites</h2>
<p>You need the Mac OS X Xcode developer tools, which include Java 1.6.x. You can get them for free from the <a href="https://developer.apple.com/mac/" target="_blank">Apple Mac Dev Center</a>. You have to become a member, but there is a free membership available.</p>
<h2>Download and Unpack Latest Distros</h2>
<p>You can get a link to a mirror for Hadoop via the <a href="http://www.apache.org/dyn/closer.cgi/hadoop/core/" target="_blank">Hadoop Apache Mirror link</a> and for HBase at the <a href="http://www.apache.org/dyn/closer.cgi/hadoop/hbase/" target="_blank">HBase Apache Mirror link</a>. Each of those links will bring you to a suggested mirror for Hadoop or HBase. Once you click on the suggested link, it will bring you to a mirror with the recent releases. You can click on the <em>stable</em> link, which will then bring you to a directory that has the latest stable Hadoop (as of this writing: hadoop-0.20.2.tar.gz) or HBase (as of this writing: hbase-0.20.3.tar.gz). Click on those tar.gz files to download them.</p>
<p>I keep the distros in ~/work/pkgs: I unpack the tar files there as numbered versions and then create symbolic links to them in ~/work. You can do all of this in any directory that you control:</p>
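<pre><code>mkdir -p ~/work/pkgs
cd ~/work/pkgs
# (move the downloaded tar.gz files into ~/work/pkgs before unpacking)
tar xvzf hadoop-0.20.2.tar.gz
tar xvzf hbase-0.20.3.tar.gz
cd ..
# Version-independent links: ~/work/hadoop and ~/work/hbase
ln -s pkgs/hadoop-0.20.2 hadoop
ln -s pkgs/hbase-0.20.3 hbase
mkdir -p hadoop/logs
mkdir -p hbase/logs</code></pre>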
<p>Now you can have your tools all access ~/work/hadoop or ~/work/hbase and not care what version it is. You can update to a later version just by downloading and untarring the distro and then changing the symbolic links.</p>
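<p>Switching a link to a newly unpacked release could be sketched as follows. The <code>switch_version</code> function and the <code>WORK_DIR</code> variable are made up for illustration; the only assumption is the ~/work/pkgs layout described above.</p>
<pre><code># switch_version NAME VERSION
# Repoint WORK_DIR/NAME at WORK_DIR/pkgs/NAME-VERSION.
# (switch_version and WORK_DIR are hypothetical names, not part of Hadoop or HBase.)
WORK_DIR="${WORK_DIR:-$HOME/work}"

switch_version() {
  name="$1"
  version="$2"
  target="pkgs/$name-$version"
  # Refuse to relink if the new release has not been unpacked yet
  if [ ! -d "$WORK_DIR/$target" ]; then
    echo "missing $WORK_DIR/$target"
    return 1
  fi
  # Remove only an existing link, never a real directory
  if [ -L "$WORK_DIR/$name" ]; then
    rm "$WORK_DIR/$name"
  elif [ -e "$WORK_DIR/$name" ]; then
    echo "$WORK_DIR/$name exists and is not a symlink"
    return 1
  fi
  # Relative link, the same shape as the ln -s commands above
  ln -s "$target" "$WORK_DIR/$name"
}</code></pre>
<p>For example, <code>switch_version hbase 0.20.3</code> recreates the hbase link from the listing above. Stop the servers before switching versions.</p>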
<h2>Configure Hadoop</h2>
<p>All the configuration files mentioned here will be in <em>~/work/hadoop/conf</em>. In this example we are assuming that the Hadoop servers will only be accessed from this <em>localhost</em>. If you need to make them accessible from other hosts or VMs on your LAN that support Bonjour, you could use the Bonjour name (i.e. the name of your Mac followed by .local, such as <em>mymac.local</em>) instead of <em>localhost</em> in the following Hadoop and HBase configurations.</p>
<h3>hadoop-env.sh</h3>
<p>You mainly need to tell Hadoop where your JAVA_HOME is.</p>
<p>Add the following line below the commented-out JAVA_HOME line in hadoop-env.sh:</p>
<pre><code>export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home</code></pre>
<h3>core-site.xml</h3>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;fs.default.name&lt;/name&gt;
    &lt;value&gt;hdfs://localhost:9000&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</code></pre>
<h3>hdfs-site.xml</h3>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;dfs.replication&lt;/name&gt;
    &lt;value&gt;1&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</code></pre>
<h3>mapred-site.xml</h3>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;mapred.job.tracker&lt;/name&gt;
    &lt;value&gt;localhost:9001&lt;/value&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</code></pre>
<h3>Make sure you can ssh without a password to the hostname used in the configs</h3>
<p>The Hadoop and HBase start/stop scripts use ssh to access the various servers. In Pseudo-Distributed mode everything is running on the <em>localhost</em>, but we still need to allow the scripts to ssh to the localhost.</p>
<h4>Check that you can ssh to the <em>localhost</em> (or whatever hostname you used in the above configs)</h4>
<p>We&#8217;re assuming that we&#8217;ll be running the Hadoop/HBase servers as the same user as our login. You can set things up to run as the hadoop user, but it&#8217;s kind of complicated on Mac OS X. See the section<em> File System Layout</em> in an earlier post <em><a href="http://blog.ibd.com/scalable-deployment/hadoop-hdfs-and-hbase-on-ubuntu/" target="_blank">Hadoop, HDFS and Hbase on Ubuntu &amp; Macintosh Leopard</a>.</em> That section and a few other points throughout that post describe how to create and use a hadoop user to run the Hadoop and HBase servers.</p>
<p>Back to just doing this as our own user. Test that you can ssh to the <em>localhost</em> without a password:</p>
<pre>ssh localhost</pre>
<p>If you see something like the following output, ending in a password prompt, then you need to add a key to your ssh setup that does not need a password (you may need to say yes if you are asked whether you want to continue connecting).</p>
<pre>The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 3c:5d:6a:39:64:78:02:9d:a3:c9:69:68:50:23:71:eb.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Password:</pre>
<p>To create a passwordless key and add it to the set of authorized keys that can access your host, do the following as yourself, not as root (the key file name is arbitrary):</p>
<pre>ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa_for_hadoop
cat ~/.ssh/id_dsa_for_hadoop.pub &gt;&gt; ~/.ssh/authorized_keys</pre>
<p>If you have strong alternative opinions on how to set up your own keys to accomplish the same thing please do it your own way. This is just the basic way of doing a passwordless ssh. You may want to use a key you already have lying around or some other mechanism.</p>
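<p>One caveat: on its own, ssh only offers keys stored under the default names (such as <em>~/.ssh/id_dsa</em> or <em>~/.ssh/id_rsa</em>), so a key saved as <em>id_dsa_for_hadoop</em> may not be picked up when the Hadoop scripts run ssh unless it is loaded in an ssh agent. A minimal sketch of a <em>~/.ssh/config</em> entry that points ssh at it, assuming the file name used above:</p>
<pre>Host localhost
    IdentityFile ~/.ssh/id_dsa_for_hadoop</pre>
<p>If you used a Bonjour name instead of localhost in the configs, use that name on the Host line.</p>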
<h3>Start Hadoop</h3>
<h4>One time format of Hadoop File System</h4>
<p>Only once, before the first time you use Hadoop, you have to create a formatted Hadoop File System. Don&#8217;t do this again once you have data in your Hadoop file system, as it will erase anything you might have saved there. You may have to run this command again if you somehow screw up your file system, but it&#8217;s not something to do lightly the second time.</p>
<pre>~/work/hadoop/bin/hadoop namenode -format</pre>
<p>If all goes well, you should see something like:</p>
<pre>10/05/02 18:45:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = Psion.local/192.168.50.16
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/02 18:45:04 INFO namenode.FSNamesystem: fsOwner=rberger,rberger,admin,com.apple.access_screensharing,_developer,_lpoperator,_lpadmin,_appserveradm,_appserverusr,localaccounts,everyone,com.apple.sharepoint.group.2,com.apple.sharepoint.group.3,dev,com.apple.sharepoint.group.1,workgroup
10/05/02 18:45:04 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/02 18:45:04 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/02 18:45:04 INFO common.Storage: Image file of size 97 saved in 0 seconds.
10/05/02 18:45:04 INFO common.Storage: Storage directory /tmp/hadoop-rberger/dfs/name has been successfully formatted.
10/05/02 18:45:04 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Psion.local/192.168.50.16
************************************************************/</pre>
<h4>Starting and stopping Hadoop</h4>
<p>Now you can start Hadoop. You will use this command to start Hadoop in general:</p>
<pre>~/work/hadoop/bin/start-all.sh</pre>
<p>You can stop Hadoop with the command</p>
<pre>~/work/hadoop/bin/stop-all.sh</pre>
<p>But remember if you are running HBase, stop that first, then stop Hadoop.</p>
<h3>Making sure Hadoop is working</h3>
<p>You can see the Hadoop logs in ~/work/hadoop/logs</p>
<p>You should be able to see the Hadoop Namenode web interface at <a href="http://localhost:50070/" target="_blank">http://localhost:50070/</a> and the JobTracker Web Interface at <a href="http://localhost:50030/" target="_blank">http://localhost:50030/</a>. If not, check that you have 5 java processes running, where each of those java processes has one of the following as the last item on its command line (as seen from a <code>ps ax | grep hadoop</code> command):</p>
<pre>org.apache.hadoop.mapred.JobTracker
org.apache.hadoop.hdfs.server.namenode.NameNode
org.apache.hadoop.mapred.TaskTracker
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
org.apache.hadoop.hdfs.server.datanode.DataNode</pre>
<p>If you do not see these 5 processes, check the logs in ~/work/hadoop/logs/*.{out,log} for messages that might give you a hint as to what went wrong.</p>
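<p>Checking for the five classes by eye gets tedious, so here is a sketch of a small helper that does it. The <code>check_daemons</code> name is made up for illustration; it only reads <code>ps ax</code> output on stdin and looks for the class names listed above.</p>
<pre><code># check_daemons: read "ps ax" output on stdin and report which of the
# five expected Hadoop classes appear in it.
# (check_daemons is a hypothetical helper, not part of Hadoop.)
check_daemons() {
  ps_output="$(cat)"
  missing=0
  for class in \
    org.apache.hadoop.mapred.JobTracker \
    org.apache.hadoop.hdfs.server.namenode.NameNode \
    org.apache.hadoop.mapred.TaskTracker \
    org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode \
    org.apache.hadoop.hdfs.server.datanode.DataNode
  do
    # Plain substring match on the fully qualified class name
    case "$ps_output" in
      *"$class"*) echo "running: $class" ;;
      *) echo "MISSING: $class"
         missing=$((missing + 1)) ;;
    esac
  done
  # Exit status is the number of missing daemons (0 means all five were found)
  return "$missing"
}</code></pre>
<p>Use it as <code>ps ax | check_daemons</code>. Any line starting with MISSING tells you which log file to look at first.</p>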
<h4>Run some example map/reduce jobs</h4>
<p>The Hadoop distro comes with some example / test map / reduce jobs. Here we&#8217;ll run them and make sure things are working end to end.</p>
<pre><code>cd ~/work/hadoop
# Copy the input files into the distributed filesystem
# (there will be no output visible from the command):
bin/hadoop fs -put conf input
# Run some of the examples provided:
# (there will be a large amount of INFO statements as output)
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
# Examine the output files:
bin/hadoop fs -cat output/part-00000
</code></pre>
<p>The resulting output should be something like:</p>
<pre>3	dfs.class
2	dfs.period
1	dfs.file
1	dfs.replication
1	dfs.servers
1	dfsadmin
1	dfsmetrics.log</pre>
<h2>Configuring HBase</h2>
<p>The following config files all reside in <em>~/work/hbase/conf</em>. As mentioned earlier, use a FQDN or a Bonjour name instead of localhost if you need remote clients to access HBase. But if you don&#8217;t use localhost here, make sure you do the same in the Hadoop config.</p>
<h3>hbase-env.sh</h3>
<p>Add the following line below the commented-out JAVA_HOME line in hbase-env.sh:</p>
<pre><code>export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home</code></pre>
<p>Add the following line below the commented out HBASE_CLASSPATH= line</p>
<pre><code>export HBASE_CLASSPATH=${HOME}/work/hadoop/conf</code></pre>
<h3>hbase-site.xml</h3>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;
&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;hbase.rootdir&lt;/name&gt;
    &lt;value&gt;hdfs://localhost:9000/hbase&lt;/value&gt;
    &lt;description&gt;The directory shared by region servers.
    &lt;/description&gt;
  &lt;/property&gt;
&lt;/configuration&gt;
</code></pre>
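<p>With the configuration in place, HBase is started with <em>bin/start-hbase.sh</em>, the counterpart of the stop-hbase.sh script used later, and Hadoop needs to be running first. A minimal sketch that wraps the ordering, assuming the ~/work layout from above (the <code>start_stack</code> and <code>stop_stack</code> function names and the <code>WORK_DIR</code> variable are made up for illustration):</p>
<pre><code># Hypothetical wrappers around the start/stop scripts shipped with Hadoop and HBase.
WORK_DIR="${WORK_DIR:-$HOME/work}"

start_stack() {
  # HDFS has to be up before HBase, because hbase.rootdir lives in HDFS
  "$WORK_DIR/hadoop/bin/start-all.sh" || return 1
  "$WORK_DIR/hbase/bin/start-hbase.sh"
}

stop_stack() {
  # Reverse order: stop HBase first, then Hadoop
  "$WORK_DIR/hbase/bin/stop-hbase.sh"
  "$WORK_DIR/hadoop/bin/stop-all.sh"
}</code></pre>
<p>If Hadoop is already running, calling <code>~/work/hbase/bin/start-hbase.sh</code> directly does the same job.</p>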
<h3>Making Sure HBase is Working</h3>
<p>If you do a ps ax | grep hbase you should see two java processes. One should end with:<br />
<code>org.apache.hadoop.hbase.zookeeper.HQuorumPeer start</code><br />
And the other should end with:<br />
<code>org.apache.hadoop.hbase.master.HMaster start</code><br />
Since we are running in the Pseudo-Distributed mode, there will not be any explicit regionservers running. If you have problems, check the logs in ~/work/hbase/logs/*.{out,log}</p>
<h3>Testing HBase using the HBase Shell</h3>
<p>From the unix prompt give the following command:</p>
<pre>~/work/hbase/bin/hbase shell</pre>
<p>Here are some example commands from the Apache HBase Installation Instructions:</p>
<pre>hbase&gt; # Type "help" to see shell help screen
hbase&gt; help
hbase&gt; # To create a table named "mylittletable" with a column family of "mylittlecolumnfamily", type
hbase&gt; create "mylittletable", "mylittlecolumnfamily"
hbase&gt; # To see the schema for your just created "mylittletable" table and its single "mylittlecolumnfamily", type
hbase&gt; describe "mylittletable"
hbase&gt; # To add a row whose id is "myrow", to the column "mylittlecolumnfamily:x" with a value of 'v', do
hbase&gt; put "mylittletable", "myrow", "mylittlecolumnfamily:x", "v"
hbase&gt; # To get the cell just added, do
hbase&gt; get "mylittletable", "myrow"
hbase&gt; # To scan your new table, do
hbase&gt; scan "mylittletable"</pre>
<p>You can stop hbase with the command:</p>
<pre>~/work/hbase/bin/stop-hbase.sh</pre>
<p>Once that has stopped you can stop hadoop:</p>
<pre>~/work/hadoop/bin/stop-all.sh</pre>
<h2>Conclusion</h2>
<p>You should now have a fully working Pseudo-Distributed Hadoop / HBase setup on your Mac. This is not suitable for any kind of large data or production project. In fact it will probably fail if you try to do anything with lots of data or high volumes of I/O. HBase seems to not like to work well until you get 4 &#8211; 5 regionservers.</p>
<p>But this Pseudo-Distributed version should be fine for doing experiments with tools and small data sets.</p>
<p>Now I can get on with playing with <a href="http://github.com/clj-sys/cascading-clojure" target="_blank">Cascading-Clojure</a> and <a href="http://nathanmarz.com/blog/introducing-cascalog/" target="_blank">Cascalog</a>!</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://blog.ibd.com/scalable-deployment/hbase-hadoop-on-mac-ox-x/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Want to work at a Startup with Cool Tech? (HBase, Clojure, Chef, Swarms, Javascript, Ruby &amp; Rails)</title>
		<link>http://blog.ibd.com/scalable-deployment/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/</link>
		<comments>http://blog.ibd.com/scalable-deployment/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 18:15:01 +0000</pubDate>
		<dc:creator>Robert J Berger</dc:creator>
				<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Opscode Chef]]></category>
		<category><![CDATA[Ruby / Rails]]></category>
		<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[rabbitmq]]></category>
		<category><![CDATA[tweekts]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://blog.ibd.com/?p=253</guid>
		<description><![CDATA[Opportunity Knocks
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa.com, the startup where I am CTO, is looking for great developers to join our small agile team. We&#8217;re an early stage, pre-series-A startup (presently funded with strategic investments from two large corporations). Runa offers [...]]]></description>
			<content:encoded><![CDATA[<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Opportunity Knocks</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa.com, the startup where I am CTO, is looking for great developers to join our small agile team. We&#8217;re an early stage, pre-series-A startup (presently funded with strategic investments from two large corporations). Runa offers a SaaS to online merchants that allows them to offer dynamic product and consumer specific promotions embedded in their website. This will be a very large positive disruption to the online retailing world.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><span style="text-decoration: underline;">Techie keywords:</span> <strong>clojure, hadoop, hbase, rabbitmq, erlang, chef, swarm computing, ruby, rails, javascript, amazon EC2, emacs, Macintosh, Linux, selenium, test/behavior driven development, agile, lean, XP, scalability</strong></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">If you&#8217;re interested, email  <a href="mailto:jobs@runa.com">jobs@runa.com</a></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">If you want to know more, read on!</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>What do we do</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa aims to provide online retailers, from the top of the long tail through the middle of the top 500, with the kind of tools and services that companies like amazon.com use and provide. These smaller guys can&#8217;t afford or don&#8217;t have the resources to do anything on that scale, but by using our SaaS services, they can make more money while providing customers with greater value.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">The first service we&#8217;re building is what we call Dynamic Sale Price.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">It&#8217;s a simple concept &#8211; it allows the online-retailer to offer a sale price for each product on his site, personalized to the individual consumer who is browsing it. By using this service, merchants are able to -</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<ul>
<li>Increase conversion (get them to buy!) and</li>
<li>Offer consumers a special price which maximizes the merchant&#8217;s profit</li>
</ul>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">This is different from &#8220;dumb-discounting&#8221; where something is marked-down, and everyone sees the same price. This service is more like airline or hotel pricing which varies from day to day, but much more dynamic and real-time. Further, it is based on broad statistical factors AND individual consumer behavior. After all, if you lower prices enough, consumers will buy. Instead, we dynamically lower prices to a point where statistically, that consumer is most likely to buy.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>How we do it</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa does this by performing statistical analysis and pattern recognition of what consumers are doing on the merchant sites. This includes browsing products on various pages, adding and removing items from carts, and purchasing or abandoning the carts. We track consumers as they browse, and collect vast quantities of this click-stream data. By mining this data and applying algorithms to determine a price point per consumer based on their behavior, we&#8217;re able to  maximize both conversion (getting the consumer to buy) AND merchant profit.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We also offer the merchant comprehensive reports based on analysis of the mountains of data we collect. Since the data tracks consumer activity down to the individual product SKU level (for each individual consumer), we can provide very rich analytics.  This is a tool that merchants need today, but don&#8217;t have the resources to build for themselves.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>The business model</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">For reference, it is useful to understand the affiliate marketing space. Small-to-medium merchants (our target audience) pay affiliates up to 40% of a sale price. Yes, 40%. The average is in the 20% range.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We charge our merchants around 10% of the sales that Runa delivers. Our merchants are happy to pay it, because it is performance-based pay, lower than what they pay affiliates, and there is zero up-front cost to the service. In fact, the above mentioned analytics reports are free.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re targeting e-commerce PLATFORMS (as opposed to individual merchants); in this way, we&#8217;re able to scale up merchant-acquisition. We have 10 early-customer merchants right now, with about 100 more planned to go live in the next 2-3 months. By the end of next year, we&#8217;re targeting about 1,000 merchants and 10,000 merchants the following year. Our channel deployment model makes these goals achievable.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">At something like a 5 to 10% service charge, and a typical merchant having between 500K and 1M in sales per year, this is a VERY profitable business model. That is, of course, if we&#8217;re successful&#8230; but we&#8217;re seeing very positive signs so far.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Technology</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Most of our front-end stuff (like the merchant-dashboard, reports, campaign management) is built with Ruby on Rails. Our merchant integration requires browser-side JavaScript magic. All our analytics (batch-processing) and real-time pricing services are written in Clojure. We use RabbitMQ for all our messaging needs. We store data in HBase. We&#8217;re deployed on Amazon&#8217;s EC2.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Here are a few blog postings about what we&#8217;ve been up to -</p>
<p><a href="http://s-expressions.com/2009/05/02/startup-logbook-distributed-clojure-system-in-production-v02/" target="_blank">Distributed Clojure system in production</a><br />
<a href="http://s-expressions.com/2009/04/12/using-messaging-for-scalability/" target="_blank">Using messaging for scalability</a><br />
<a href="http://s-expressions.com/2009/03/31/capjure-a-simple-hbase-persistence-layer/" target="_blank">Capjure: a simple HBase persistence layer</a><br />
<a href="http://s-expressions.com/2009/01/28/startup-logbook-clojure-in-production-release-v01/" target="_blank">Clojure in production</a><br />
<a href="http://blog.ibd.com/scalable-deployment/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/" target="_blank">Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</a></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;ve also open-sourced a few of our projects -</p>
<p><a href="http://github.com/amitrathore/swarmiji/tree/master" target="_blank">swarmiji</a> &#8211; A distributed computing system to write and run Clojure code in parallel, across CPUs<br />
<a href="http://github.com/amitrathore/capjure/tree/master" target="_blank">capjure</a> &#8211; Clojure persistence for HBase</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Culture at Runa</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re a small team, very passionate about what we do. We&#8217;re focused on delivering a ground-breaking, disruptive service that will allow merchants to really change the way they sell online. We work start-up hours, but we&#8217;re flexible and laid-back about it. We know that a healthy personal life is important for a good professional life. We work with each other to support it.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We use an agile process with a lot of influences from the <a href="http://en.wikipedia.org/wiki/Lean_software_development" target="_blank">Lean</a> and <a href="http://leansoftwareengineering.com/2007/08/29/kanban-systems-for-software-development/" target="_blank">Kanban</a> world. We use <a href="http://studios.thoughtworks.com/mingle-agile-project-management" target="_blank">Mingle</a> to run our development process. Everything, OK mostly everything <img src='http://blog.ibd.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  is covered by automated tests, so we can change things as needed.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re all Apple in the office &#8211; developers get a Mac Pro with a nice 30&#8243; screen, and a nice 17&#8243; MacBook Pro.  We deploy on Ubuntu servers.  Aeron chairs are cliché, yes, but very comfy.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">The environment is chilled out&#8230; you can wear shorts and sandals to work&#8230;  Very flat organization, very non-bureaucratic&#8230; nice open spaces (no cubes!). Lunch is brought in on most days! Beer and snacks are always in the fridge.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re within walking distance of the San Antonio Caltrain station (and biking distance of the Mountain View Caltrain/VTA light rail station).</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>What&#8217;s in it for you</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<ul>
<li>Competitive salaries, and lots of stock-options</li>
<li>Cutting edge technology stack</li>
<li>Fantastic business opportunity, and early-stage (= great time to join!)</li>
<li>Developer #5 &#8211; means plenty of influence on foundational architecture and design</li>
<li>Smart, full bandwidth, fun people to work with</li>
<li>Very comfortable, nice office environment</li>
<li>We have a &#8220;No Assholes&#8221; policy</li>
</ul>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>OK!</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">So, if you&#8217;re interested, email us at <a href="mailto:jobs@runa.com">jobs@runa.com</a></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">No recruiters please!</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We would prefer folks who are already in the Bay Area (but if you are not local and are really great, let&#8217;s talk!)</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://blog.ibd.com/scalable-deployment/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOWTO: Install iClassify on Ubuntu and Mac OS X Leopard</title>
		<link>http://blog.ibd.com/scalable-deployment/howto-install-iclassify-on-ubuntu-and-mac-os-x-leopard/</link>
		<comments>http://blog.ibd.com/scalable-deployment/howto-install-iclassify-on-ubuntu-and-mac-os-x-leopard/#comments</comments>
		<pubDate>Fri, 09 Jan 2009 06:35:23 +0000</pubDate>
		<dc:creator>Robert J Berger</dc:creator>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[iClassify]]></category>
		<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Opscode Chef]]></category>
		<category><![CDATA[Puppet]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://blog.ibd.com/?p=105</guid>
		<description><![CDATA[Update:
<p>The folks who developed iClassify have come out with a complete framework called opscode-chef that is an alternative to iClassify/Puppet. I will be looking into that soon and will not do any more work with iClassify (unless Chef turns out to suck or something, but at first glance it looks pretty good!).</p>
<p>[After playing with Chef, my conclusion is [...]]]></description>
			<content:encoded><![CDATA[<h2>Update:</h2>
<p>The folks who developed iClassify have come out with a complete framework called <a href="http://wiki.opscode.com/display/chef/Home" target="_blank">opscode-chef</a> that is an alternative to iClassify/Puppet. I will be looking into that soon and will not do any more work with iClassify (unless Chef turns out to suck or something, but at first glance it looks pretty good!).</p>
<p>[After playing with Chef, my conclusion is: forget about iClassify / Puppet and just use Chef, unless you are already using Puppet.]</p>
<h2>iClassify Description</h2>
<p>From the website of iClassify&#8217;s creators, <a href="https://wiki.hjksolutions.com/display/IC/Home" target="_blank">HJK Solutions</a>:</p>
<blockquote><p>iClassify allows for the easy registration and classification of nodes. Most of the time, a node is a server. With iClassify:</p>
<ul>
<li>Nodes register themselves with a central web service, including reporting <span class="nobr"><a rel="nofollow" href="http://reductivelabs.com/projects/facter">Facter<sup><img class="rendericon" src="https://wiki.hjksolutions.com/images/icons/linkext7.gif" border="0" alt="" width="7" height="7" align="absmiddle" /></sup></a></span> facts.</li>
<li>You can then tag those nodes, and add manual attributes.</li>
<li>You can search the nodes with a full text search engine</li>
<li>You can write recipies for icagent to auto-classify and auto-attribute your nodes.</li>
<li>You can tie it in to <span class="nobr"><a rel="nofollow" href="http://reductivelabs.com/projects">Puppet<sup><img class="rendericon" src="https://wiki.hjksolutions.com/images/icons/linkext7.gif" border="0" alt="" width="7" height="7" align="absmiddle" /></sup></a></span> as an external node classification tool, enabling you to easily configure hundreds of nodes at a time.</li>
<li>You can <a title="Capistrano Task" href="https://wiki.hjksolutions.com/display/IC/Capistrano+Task">tie it in to Capistrano, and have a dynamic ad-hoc configuration tool</a>.</li>
</ul>
</blockquote>
<p>We are considering using it along with Puppet and Amazon EC2 for deployment of some of our infrastructure.</p>
<h2>Install iClassify</h2>
<p><a href="https://wiki.hjksolutions.com/display/IC/Install+Instructions">Original Instructions</a> are at HJK Solutions. What follows quotes liberally from that site, but adds the things I learnt along the way as well as how to do it on a Mac.</p>
<h3>Prerequisites</h3>
<h4>Ruby Gems</h4>
<ul>
<li>Rails 2.0.2</li>
<li>Rake</li>
<li>Builder</li>
<li>UUID Tools</li>
<li>Mongrel</li>
<li>Highline</li>
<li>Net-LDAP</li>
</ul>
<pre>sudo gem install rails -v 2.0.2
sudo gem install rake builder uuidtools mongrel highline ruby-net-ldap</pre>
<h4>Non-Gems</h4>
<ul>
<li>MySQL</li>
<li>Facter</li>
<li>Git</li>
<li>Java</li>
<li>Runit</li>
<li>Mongrel Runit</li>
</ul>
<h3>Configure MySQL</h3>
<p>The folks at HJK say they use MySQL, and that it should work with PostgreSQL and sqlite3 as well. We went with MySQL.<br />
First, create the database <em>iclassify_production</em>:</p>
<pre>mysqladmin -u root -p  create iclassify_production</pre>
<p>Grant it the correct privileges (set <em>yourpass</em> to the password you want to use):</p>
<pre>mysql -u root -p iclassify_production
mysql&gt; GRANT ALL ON iclassify_production.* TO 'iclassify'@'localhost' IDENTIFIED BY 'yourpass';</pre>
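<p>If you would rather not type the statement at the interactive prompt, a minimal sketch like the following prints the same GRANT so it can be piped into the mysql client. The function name <em>grant_sql</em> is my own invention for illustration, not part of iClassify or MySQL, and it assumes the password contains no single quotes. Note that MySQL 8.0 and later no longer accept IDENTIFIED BY inside GRANT, so on a newer server you would need a separate CREATE USER statement first.</p>

```shell
# Hypothetical helper (not part of iClassify or MySQL): print the GRANT
# statement used above for a given database, user and password.
# Assumption: none of the arguments contain single quotes.
grant_sql() {
  db_name="$1"
  db_user="$2"
  db_pass="$3"
  printf "GRANT ALL ON %s.* TO '%s'@'localhost' IDENTIFIED BY '%s';\n" \
    "$db_name" "$db_user" "$db_pass"
}

# Usage (the mysql client still prompts for the root password on the terminal):
#   grant_sql iclassify_production iclassify yourpass | mysql -u root -p
```

<p>Printing the statement first also makes it easy to eyeball what will be run before handing it to the server.</p>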
<p>You should also have the Ruby mysql gem installed:</p>
<pre>sudo gem install mysql</pre>
<p>On Mac OS X Leopard I had to use:</p>
<pre>sudo env ARCHFLAGS="-arch i386" gem install mysql -- --with-mysql-config=/usr/local/mysql/bin/mysql_config</pre>
<h3>Git</h3>
<p>On Ubuntu you can install Git from packages:</p>
<pre>sudo apt-get install git-core</pre>
<p>On the Mac, download and install the Git OS X package from the <a href="http://code.google.com/p/git-osx-installer/">Git OSX Installer on Google Code</a>, or use MacPorts:</p>
<pre>port -uR install git-core</pre>
<h3>Java</h3>
<p>Java is needed for the <a href="http://lucene.apache.org/solr" target="_blank">Solr</a> package that is bundled in the iClassify distro. It can be installed via apt and/or downloaded from the Sun site. Java is already installed on Mac OS X.</p>
<h3>Runit</h3>
<p>Runit is an alternative / addition to the standard /etc/init.d &#8220;systemV&#8221; init system. I guess the HJK folks like it and seem to have dependencies on it. They say it should work without it, but I haven&#8217;t tried putting this together without the Runit/Mongrel_runit dependencies yet.</p>
<p>Some good info and tips on setting up / running runit on various systems can be found at <a href="http://smarden.org/runit/" target="_blank">runit &#8211; a UNIX init scheme with service supervision</a></p>
<p>Runit can be safely installed with apt. It will not replace the standard init system:</p>
<pre>sudo apt-get install runit</pre>
<p>For the Mac:</p>
<pre>port install runit</pre>
<p>On the Mac, you&#8217;ll have to start the runit system with the command:</p>
<div>
<pre>sudo launchctl load -w /Library/LaunchDaemons/org.macports.runit.plist</pre>
</div>
<h3>iClassify itself</h3>
<p>In the directory where you want to keep the iClassify source:</p>
<pre>git clone git://git.hjksolutions.com/iclassify iclassify
cd iclassify</pre>
<p>You need to know where you want to install the actual working rails app of iclassify and what user id/group you want to run it under.</p>
<p>The default location and the one we&#8217;ll use on Ubuntu is <em>/srv/iclassify</em> and the user id/group is usually the same as the one that runs the apache web services (<em>www-data</em>). Change <em>yourpass</em> to the password used for the iclassify user in MySQL.</p>
<pre>sudo rake iclassify:install ICBASE=/srv/iclassify ICUSER=www-data ICGROUP=www-data DBUSER=iclassify DBPASS=yourpass</pre>
<p>For the Macintosh:</p>
<pre>sudo rake iclassify:install ICBASE=/usr/local/iclassify ICUSER=_www ICGROUP=_www DBUSER=iclassify DBPASS=yourpass</pre>
<p>This will create a new iClassify instance in <em>/srv/iclassify</em> (<em>/usr/local/iclassify</em> on the Mac), set the right ownership to run iClassify, and run the migrations to prepare your database instance. I found that I had to run this as root so that it would create the directories properly.</p>
<p>You can test that the iClassify Rails app was installed properly by running it with the built-in Rails server (on the Mac use the <em>/usr/local/iclassify</em> directory and <em>_www</em> user id):</p>
<div class="preformatted panel">
<div class="preformattedContent panelContent">
<pre>$ cd /srv/iclassify
$ sudo -u www-data env RAILS_ENV=production ./script/server
=&gt; Booting Mongrel (use 'script/server webrick' to force WEBrick)
=&gt; Rails application starting on http://0.0.0.0:3000
=&gt; Call with -d to detach
=&gt; Ctrl-C to shutdown server
** Starting Mongrel listening at 0.0.0.0:3000
** Starting Rails with production environment...
** Rails loaded.
** Loading any Rails specific GemPlugins
** Signals ready.  TERM =&gt; stop.  USR2 =&gt; restart.  INT =&gt; stop (no restart).
** Rails signals registered.  HUP =&gt; reload (without restart).  It might not work well.
** Mongrel available at 0.0.0.0:3000
** Use CTRL-C to stop.</pre>
</div>
</div>
<p>You can now point your browser to the local instance of iClassify at <span class="nobr"><a rel="nofollow" href="http://localhost:3000/">http://localhost:3000</a></span>. Hit CTRL-C when you are done to terminate script/server.</p>
<h3>Mongrel Runit</h3>
<p>You can download the mongrel_runit gem from the <a href="https://wiki.hjksolutions.com/display/MR/Home" target="_blank">Mongrel Runit page at HJK</a></p>
<p>Install it with the commands below, but first check where runit keeps its services (on the Mac you&#8217;ll have to use the /usr/local/iclassify directory instead of /srv/iclassify).</p>
<p>For some reason the HJK folks set their <em>runit_service_dir</em> to be <em>/var/service</em> but the runit Ubuntu package puts it in <em>/etc/service</em>. So you might want to edit <em>/srv/iclassify/examples/mongrel_runit_iclassify.yml</em> and set <em>runit_service_dir</em> to <em>/etc/service</em>. You can also change the number of mongrels you want to run in that file. Similarly, the MacPorts (formerly DarwinPorts) install of runit expects it to be in <em>/opt/local/var/service</em>. You can either change <em>/srv/iclassify/examples/mongrel_runit_iclassify.yml</em> or make <em>/var/service</em> a symbolic link pointing at <em>/opt/local/var/service</em>.</p>
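<p>The edit to <em>runit_service_dir</em> can be scripted instead of done by hand. The following is only a sketch: the function name <em>set_runit_service_dir</em> is made up for this post, and it assumes the setting sits on its own line in the form <em>runit_service_dir: /some/path</em>, which I have not checked against every version of the example file. It uses <em>sed -i.bak</em>, a form that both the GNU sed on Ubuntu and the BSD sed on Mac OS X accept, and which leaves a <em>.bak</em> copy of the original next to the file.</p>

```shell
# Hypothetical helper: point runit_service_dir in a mongrel_runit YAML file
# at a different directory.
# Assumption: the file has a line of the form "runit_service_dir: /var/service".
set_runit_service_dir() {
  config_file="$1"   # e.g. /srv/iclassify/examples/mongrel_runit_iclassify.yml
  service_dir="$2"   # e.g. /etc/service on Ubuntu
  # -i.bak edits in place and keeps a backup; "|" is the delimiter because
  # the replacement contains slashes.
  sed -i.bak "s|^\([[:space:]]*runit_service_dir:\).*|\1 ${service_dir}|" "$config_file"
}

# Usage (from a shell that can write to the file):
#   set_runit_service_dir /srv/iclassify/examples/mongrel_runit_iclassify.yml /etc/service
```

<p>Do this before copying the file to /etc/mongrel_runit, or run it against the copy instead.</p>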
<pre>sudo gem install mongrel_runit-0.2.1.gem
sudo mkdir /etc/mongrel_runit
sudo cp /srv/iclassify/examples/mongrel_runit_iclassify.yml /etc/mongrel_runit/iclassify.yml
sudo mongrel_runit -c /etc/mongrel_runit/iclassify.yml create</pre>
<p>You should then be able to run the command</p>
<pre>mongrel_runit -v status -c /etc/mongrel_runit/iclassify.yml</pre>
<p>and see something like the following. There should be one line per mongrel server you configured; I changed iclassify.yml from 5 to 3:</p>
<pre>5000: true: run: /etc/sv/mongrel-iclassify-5000: (pid 4403) 119s; run: log: (pid 4402) 119s
5001: true: run: /etc/sv/mongrel-iclassify-5001: (pid 4401) 119s; run: log: (pid 4400) 119s
5002: true: run: /etc/sv/mongrel-iclassify-5002: (pid 4399) 119s; run: log: (pid 4398) 119s</pre>
<h3>Solr</h3>
<p>First create some directories that will be needed for Solr to run its index as the www user (on the Mac replace /srv with /usr/local).</p>
<pre>sudo mkdir -p /srv/iclassify/vendor/plugins/acts_as_solr/solr/solr/data/production
sudo chown -R www-data:www-data /srv/iclassify/vendor/plugins/acts_as_solr/solr/solr/data/</pre>
<p>You can then test it with the command:</p>
<pre>sudo -u www-data env RAILS_ENV=production rake solr:start</pre>
<p>It should start and run with no errors.</p>
<p>You can stop it if you want with:</p>
<pre>sudo -u www-data env RAILS_ENV=production rake solr:stop</pre>
<p>Then set up runit to run it automatically (use /usr/local instead of /srv on the Mac, and if you are not on Ubuntu with /etc/service, use /var/service or whatever your system uses as the runit service directory):</p>
<pre>sudo mkdir -p /etc/sv/iclassify-solr/log/main
sudo cp /srv/iclassify/examples/solr-run /etc/sv/iclassify-solr/run
sudo cp /srv/iclassify/examples/solr-log /etc/sv/iclassify-solr/log/run
sudo chmod a+x /etc/sv/iclassify-solr/run /etc/sv/iclassify-solr/log/run</pre>
<p>The following will start the Solr process immediately as well as on future reboots:</p>
<pre>sudo ln -s /etc/sv/iclassify-solr /etc/service</pre>
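<p>To confirm that the link is in place before asking runit about the service, a small check like this can help. It is a sketch under my own naming (<em>service_linked</em> is not a runit command); it only verifies that the entry in the service directory is a symbolic link and that its <em>run</em> script is executable, which are the two things the steps above set up.</p>

```shell
# Hypothetical helper: succeed only if the named service is symlinked into
# the runit service directory and its run script is executable.
service_linked() {
  service_dir="$1"   # /etc/service on Ubuntu
  name="$2"          # iclassify-solr in this post
  [ -L "${service_dir}/${name}" ] && [ -x "${service_dir}/${name}/run" ]
}

# Usage: if the check passes, ask runit for the live status with sv.
#   service_linked /etc/service iclassify-solr && sudo sv status /etc/service/iclassify-solr
```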
<h3>Apache</h3>
<p>iClassify is best configured as a virtual host under Apache, running with SSL and mod_proxy_balancer. Follow the proper steps for configuring your platform&#8217;s Apache to use mod_ssl, mod_proxy_balancer and mod_rewrite. Create a virtual host config which resembles the following. It works on both Mac and Ubuntu: on the Mac just change the references to /srv to /usr/local, and in either case change EXAMPLE.COM to your domain.</p>
<div class="preformatted panel">
<div class="preformattedContent panelContent">
<pre>&lt;VirtualHost *:443&gt;
  DocumentRoot /srv/iclassify/public
  LimitRequestBody 8388608
  ServerName iclassify.EXAMPLE.COM
  ServerAlias iclassify
  &lt;Directory /&gt;
    Options FollowSymLinks
    AllowOverride None
  &lt;/Directory&gt;

  &lt;Location /server-status&gt;
    SetHandler server-status
    Order Deny,Allow
    Deny from all
    Allow from 127.0.0.1 192.168.0.0/255.255.0.0
  &lt;/Location&gt;

  &lt;Proxy balancer://iclassify&gt;
    BalancerMember http://localhost:5000
    BalancerMember http://localhost:5001
    BalancerMember http://localhost:5002
    BalancerMember http://localhost:5003
  &lt;/Proxy&gt;

  LogLevel info
  ErrorLog /var/log/apache2/iclassify-error.log
  CustomLog /var/log/apache2/iclassify-access.log combined

  RewriteEngine On
  RewriteLog /var/log/apache2/iclassify-rewrite.log
  RewriteLogLevel 0
  RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
  RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
  RewriteCond %{SCRIPT_FILENAME} !maintenance.html
  RewriteRule ^.*$ /system/maintenance.html [L]
  RewriteRule ^/server-status$ /server-status$1 [L]

  RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
  RequestHeader set X_FORWARDED_PROTO 'https'
  RewriteRule ^/(.*)$ balancer://iclassify%{REQUEST_URI} [P,QSA,L]
  ProxyPassReverse / balancer://iclassify
  SetEnv proxy-nokeepalive 1

  SSLEngine on
  SSLCertificateFile /etc/apache2/ssl/iclassify.crt
  SSLCertificateKeyFile /etc/apache2/ssl/iclassify.key

  BrowserMatch ".*MSIE.*" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0
&lt;/VirtualHost&gt;

&lt;VirtualHost *:80&gt;
  DocumentRoot /srv/iclassify/public
  LimitRequestBody 8388608
  ServerName iclassify.EXAMPLE.COM
  ServerAlias iclassify

  RewriteEngine On
  RewriteCond %{HTTPS} !=on
  RewriteRule ^/(.*) https://%{SERVER_NAME}/ [R,L]
&lt;/VirtualHost&gt;</pre>
</div>
</div>
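<p>Note that the virtual host above lists four BalancerMember entries (ports 5000 to 5003), while the mongrel_runit status output earlier showed three mongrels (5000 to 5002). The list presumably should match the number of mongrels you actually run. A rough sketch for generating the lines, using a function name (<em>balancer_members</em>) that I made up for illustration:</p>

```shell
# Hypothetical helper: print one BalancerMember line per mongrel,
# starting at the first port and counting upward.
balancer_members() {
  first_port="$1"   # 5000 in this post
  count="$2"        # number of mongrels set in iclassify.yml
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "    BalancerMember http://localhost:$((first_port + i))"
    i=$((i + 1))
  done
}

# Usage: paste the output between the Proxy tags of the virtual host.
#   balancer_members 5000 3
```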
<p>Basic instructions for setting up SSL certificates can be found in Justin Samuel&#8217;s blog post <a href="http://www.justinsamuel.com/2006/03/11/howto-create-a-self-signed-wildcard-ssl-certificate/" target="_blank"><span style="color: #000000; text-decoration: none;">HOWTO: Create a self-signed (wildcard) SSL certificate</span></a>.</p>
<p>Take the resulting hosts.cert and copy it to /etc/apache2/ssl/iclassify.crt, and copy hosts.key to /etc/apache2/ssl/iclassify.key (or wherever you put your SSL keys). Make sure the SSLCertificateFile and SSLCertificateKeyFile settings in your vhost conf file point to the same locations.</p>
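<p>If you only need a quick self-signed certificate for testing, a single openssl command can also produce the two files directly. This is a minimal sketch and not the procedure from the linked post: the function name <em>make_self_signed_cert</em>, the 2048-bit RSA key and the 365-day lifetime are my own assumptions, and it creates a certificate for one host name rather than a wildcard.</p>

```shell
# Hypothetical helper: write a self-signed certificate and an unencrypted
# private key, named as the virtual host config expects.
# Assumptions: 2048-bit RSA key, valid for 365 days, single host name.
make_self_signed_cert() {
  common_name="$1"   # e.g. iclassify.EXAMPLE.COM
  out_dir="$2"       # e.g. /etc/apache2/ssl
  openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=${common_name}" \
    -keyout "${out_dir}/iclassify.key" \
    -out "${out_dir}/iclassify.crt"
}

# Usage (from a shell that can write to the target directory):
#   make_self_signed_cert iclassify.EXAMPLE.COM /etc/apache2/ssl
```

<p>Browsers will warn about a self-signed certificate, which is fine for an internal tool but worth knowing before you roll it out.</p>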
<h3>Conclusion</h3>
<p>That should get you up and running with iClassify. In a future post I will install Puppet and then figure out how to use these together to deploy to Amazon EC2.</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://blog.ibd.com/scalable-deployment/howto-install-iclassify-on-ubuntu-and-mac-os-x-leopard/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
