These are the basic, scripted directions. I am neither an expert nor even a regular user of Hadoop, so I have no insight beyond knowing that this scripted example works. If you are going to use HOD regularly, you will need to become an HOD and Hadoop expert. I am sure there are many options and configuration possibilities that will help you!
1. In your home directory, make a ".hod" directory
> mkdir .hod
2. Copy ~act/.hod/hodrc into your .hod directory
> cp ~act/.hod/hodrc .hod/
3. Edit the hodrc file and change "act" to your username everywhere
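If you would rather not make the replacement by hand, the sketch below shows one possible way to do it with sed. The function name personalize_hodrc is made up for illustration and is not part of HOD. It assumes GNU sed (the \b word-boundary syntax is a GNU extension) and that your username contains no characters that are special to sed. It prints to stdout so that you can review the result before overwriting anything.

```shell
# Sketch only: print a personalized copy of a hodrc template to stdout.
# personalize_hodrc is a hypothetical helper, not part of HOD.
# \b matches a word boundary (GNU sed), so words such as "action" are left alone.
personalize_hodrc() {
    template=$1   # path of the hodrc to read, e.g. ~act/.hod/hodrc
    new_user=$2   # username to substitute for "act", e.g. "$USER"
    sed "s/\bact\b/${new_user}/g" "$template"
}
```

For example, personalize_hodrc ~act/.hod/hodrc "$USER" > ~/.hod/hodrc would write the personalized copy into your own .hod directory. Look the file over afterwards, since an automatic replacement can still touch something it should not.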
4. Edit your .bashrc or .cshrc (whichever shell you are using) and add the directory /usr/share/hadoop/contrib/hod/bin/ to your PATH. E.g., add this line to the end of your .bashrc (.cshrc uses different syntax!)
export PATH=$PATH:/usr/share/hadoop/contrib/hod/bin/
Source your .bashrc (or log in again) to make the change take effect in your current shell.
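One side effect of the plain export line is that every time .bashrc is sourced again, the same directory gets appended to PATH once more. A minimal sketch of a guard against that is shown here; the helper name path_append is an invention for this example, and it is meant for sh/bash, not csh.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, do nothing
        *) PATH="$PATH:$1" ;;     # not present, append it
    esac
}

path_append /usr/share/hadoop/contrib/hod/bin/
export PATH
```

This is harmless but optional; the single export line above does the job if you only source your .bashrc once per login.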
5. Make a directory in which you want Hadoop to manage its instance; you can call it anything you want and put it anywhere you want
> mkdir hadoop
6. Use the "hod" command to create and start up a Hadoop instance, such as
> hod allocate -d ~/hadoop/ -n 4
This uses Torque to allocate the requested number of nodes (4 in this case) and starts a Hadoop instance with a new HDFS filesystem. You can probably tell it to use an existing filesystem with data already in it; that would be set in the hodrc file, along with some other parameters that I don't know. You can confirm that a Torque job is running by using the "qstat" command.
7. Use your Hadoop instance, always referencing your configuration directory using "--config <yourdirectory>". For example, we did this:
See that it works and list available commands:
> hadoop --config ~/hadoop
List root of our HDFS filesystem:
> hadoop --config ~/hadoop fs -ls /
Put a file into HDFS as the name test.dat:
> hadoop --config ~/hadoop fs -put ./hod-jcook.log /test.dat
See that it is there:
> hadoop --config ~/hadoop fs -ls /
Run the sample wordcount Hadoop program on our file:
> hadoop --config ~/hadoop jar /usr/share/hadoop/hadoop-examples-1.2.1.jar wordcount /test.dat /testout
See that the result directory "testout" was created by the wordcount program:
> hadoop --config ~/hadoop fs -ls /
Look at what is in it:
> hadoop --config ~/hadoop fs -ls /testout
Look at one of the output files that contains the wordcount:
> hadoop --config ~/hadoop fs -cat /testout/part-r-00000
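Since every command in this step repeats "--config ~/hadoop", a small shell function can save some typing. This is only a convenience sketch for sh/bash: the names HOD_DIR and hd are made up here, and HOD_DIR is assumed to be the same directory you passed to "hod allocate -d".

```shell
# Assumption: HOD_DIR is the directory that was given to "hod allocate -d".
HOD_DIR="$HOME/hadoop"

# hd is a hypothetical shorthand that runs hadoop against the HOD-managed instance,
# passing all of its arguments through unchanged.
hd() {
    hadoop --config "$HOD_DIR" "$@"
}
```

With this defined, "hd fs -ls /" is equivalent to "hadoop --config ~/hadoop fs -ls /", and "hd fs -cat /testout/part-r-00000" shows the wordcount output.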
8. When you are done with your Hadoop instance, deallocate it:
> hod deallocate -d ~/hadoop
If you want to keep your HDFS filesystem after deallocating, you may need some additional options; I just don't know which. You will need to read up on HOD and Hadoop.
9. Use the "qstat" command to check whether the Torque job for your HOD instance has finished. If it still shows up, it is still running! Wait a little while, but if it does not finish on its own, YOU MUST DELETE IT MANUALLY. In our experience, HOD sometimes leaves the job running.
> qstat
> qdel 142
(142 is an example job number.)
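To pick your own leftover jobs out of a long qstat listing, something like the following sketch may help. The helper name jobs_for_user is hypothetical, and it assumes the default Torque qstat layout: two header lines, then one job per line with the job ID in the first column and the user name in the third. Check your own qstat output before relying on it.

```shell
# Hypothetical helper: read "qstat" output on stdin and print the IDs of the
# jobs owned by the given user.
# Assumes two header lines, job ID in column 1, user name in column 3.
jobs_for_user() {
    awk -v user="$1" 'NR > 2 && $3 == user { print $1 }'
}
```

Run it as: qstat | jobs_for_user "$USER". It only lists job IDs; once you have confirmed that an ID belongs to the HOD instance you deallocated, delete it with qdel as shown above.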