Tuesday, April 28, 2015

Setting up boot2docker and working with a live-reload directory the easy way



Check out https://github.com/boot2docker/boot2docker/pull/534

Download latest boot2docker from:
https://github.com/boot2docker/windows-installer/releases

If you log in to the boot2docker VM with:
$ boot2docker ssh

you will not have access to the development files on your host. To make your
development files accessible, do the following on Windows:

$ boot2docker stop
$ VBoxManage.exe sharedfolder add boot2docker-vm --name Users --hostpath C:/Users --automount



Note 1: if you are using Cygwin you have to write the path as C:/Users, not C:\Users
Note 2: boot2docker automatically mounts shares with the following names:
`Users` share at `/Users` in the boot2docker VM
`/Users` share at `/Users` in the boot2docker VM
`c/Users` share at `/c/Users` in the boot2docker VM
`/c/Users` share at `/c/Users` in the boot2docker VM
`c:/Users` share at `/c/Users` in the boot2docker VM
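
The table in Note 2 can be written down as a small lookup. This is only a sketch: share_mount_point is a hypothetical helper of my own, not part of boot2docker, and it assumes the left-hand column of the table is the share name you pass to --name.

```shell
# Hypothetical helper (not part of boot2docker): print the guest mount point
# that corresponds to a VirtualBox share name, following the Note 2 table.
share_mount_point() {
  case "$1" in
    Users|/Users)              echo "/Users" ;;
    c/Users|/c/Users|c:/Users) echo "/c/Users" ;;
    *) return 1 ;;  # any other share name is not in the table
  esac
}
```

A share name that is not in the table makes the function return a non-zero status, which is a hint that the share has to be mounted by hand.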


$ boot2docker up
$ boot2docker ssh "ls /Users"

Voila! You will see your files inside boot2docker!

But what about sharing a path other than Users?
Do as follows:


$VBoxManage.exe sharedfolder add boot2docker-vm --name Afrepo --hostpath F:/gitrepositories2 --automount

$ boot2docker ssh "ls /"
You will not find your Afrepo in the list, right? Try the following:
$ boot2docker ssh "mkdir /home/docker/Afrepo && sudo mount -t vboxsf -o uid=1000,gid=50 Afrepo /home/docker/Afrepo"
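
If you share more than one folder, the manual mount can be wrapped in a small function. This is a sketch under assumptions: mount_share and the DRY_RUN switch are names I made up, and the mount options (uid=1000,gid=50) and the target under /home/docker are simply copied from the command above.

```shell
# Hypothetical wrapper around the manual mount step shown above.
# With DRY_RUN=1 it only prints the command instead of running it.
mount_share() (
  share=$1
  target=/home/docker/$share
  # mkdir -p so that running the function twice does not fail on mkdir
  cmd="mkdir -p $target && sudo mount -t vboxsf -o uid=1000,gid=50 $share $target"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'boot2docker ssh "%s"\n' "$cmd"
  else
    boot2docker ssh "$cmd"
  fi
)
```

Running it as DRY_RUN=1 mount_share Afrepo first lets you check the command before it touches the VM.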


Your files at F:/gitrepositories2 on your host are now accessible inside the boot2docker VM under /home/docker/Afrepo, and therefore to your Docker containers!

Try the following:
$ boot2docker ssh
$ docker run -v /Users/...:...                if you used Users as the share name, or
$ docker run -v /home/docker/Afrepo/...:...   if you used Afrepo as the share name




Tuesday, March 10, 2015

Setting up Docker and boot2docker with a shared folder on Mac or Windows

On Mac:
$ brew install docker
$ brew install boot2docker
$ boot2docker stop
$ VBoxManage sharedfolder add boot2docker-vm --name home --hostpath $HOME
$ boot2docker up
$ boot2docker ssh "sudo modprobe vboxsf && mkdir -p $HOME && sudo mount -t vboxsf home $HOME"
You can now use docker run -v transparently, as long as the volume is inside $HOME.
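
Since only paths under $HOME are shared, a quick check before calling docker run -v can save some head-scratching. A minimal sketch, with is_under_home being a hypothetical helper name of my own:

```shell
# Hypothetical helper: succeeds only if the given absolute path is $HOME
# itself or lies below it, i.e. inside the folder shared with the VM.
is_under_home() {
  case "$1" in
    "$HOME"|"$HOME"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_under_home "$PWD" || echo "not shared with the VM" warns you before you mount an empty directory into a container.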
More tips: http://www.incrediblemolk.com/sharing-a-windows-folder-with-the-boot2docker-vm/
https://github.com/boot2docker/boot2docker/pull/284



Wednesday, December 31, 2014

vagrant/machine.rb:153:in `action': wrong number of arguments (2 for 1) (ArgumentError)


Change the line from:

                  def action(name, **opts)

to:

                  def action(name, opts)

in the file /opt/vagrant/embedded/gems/gems/vagrant-1.7.1/lib/vagrant/machine.rb, line 153.
This will be fixed in versions of Vagrant after 1.7.1.
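
If you prefer not to edit the file by hand, the same change can be scripted. This is a sketch only: patch_action_signature is a hypothetical helper of my own, it matches the method signature rather than the line number, and it keeps a .bak copy next to the file. Under /opt you will most likely need to run it with root privileges.

```shell
# Hypothetical helper: rewrite "def action(name, **opts)" to
# "def action(name, opts)" in the given file, keeping a backup as FILE.bak.
patch_action_signature() (
  file=$1
  cp "$file" "$file.bak" || exit 1
  # In a basic regular expression ( and ) are literal; \* is a literal star.
  sed 's/def action(name, \*\*opts)/def action(name, opts)/' "$file.bak" > "$file"
)
```

Call it with the path to machine.rb given above, and restore the .bak file if anything looks wrong.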

Tuesday, December 30, 2014

Vagrant: Failed to mount folders in Linux guest. This is usually because the "vboxsf" file system is not available.



This is an issue with older versions of VirtualBox. Download the latest VirtualBox from the VirtualBox download site.

If upgrading to the latest version does not resolve it, you may also need to install the vagrant-vbguest plugin:
$ vagrant plugin install vagrant-vbguest
More tips for solving the problem:
https://github.com/mitchellh/vagrant/issues/3341
https://www.virtualbox.org/manual/ch04.html#idp54932560

Tuesday, October 14, 2014

Caused by: org.apache.axis2.AxisFault: Address information does not exist in the Endpoint Reference (EPR).The system cannot infer the transport mechanism.



In my case the error came from this property:

<property name="wsdlFilename" value="MyServiceWS20Service.wsdl" />


Axis2 cannot find the WSDL: either the file is missing from the resources or its name is misspelled.

Friday, August 1, 2014

Get http:///var/run/docker.sock/v1.12/info: dial unix /var/run/docker.sock: no such file or directory


If, like me, you are running on Mac OS X, you get this error because you have not started boot2docker and exported DOCKER_HOST.
$ docker info
2014/08/01 17:34:31 Get http:///var/run/docker.sock/v1.12/info: dial unix /var/run/docker.sock: no such file or directory


Do the following to fix it:
$ boot2docker up

You will get output like the following:
2014/08/01 17:34:21 Started.
2014/08/01 17:34:21 To connect the Docker client to the Docker daemon, please set:

2014/08/01 17:34:21     export DOCKER_HOST=tcp://134.155.10.100:2375

Export DOCKER_HOST on your command line:
$ export DOCKER_HOST=tcp://134.155.10.100:2375
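
The address changes from machine to machine, so instead of copying it by hand you can pull it out of the startup message. A minimal sketch, assuming the output has the shape shown above; docker_host_from_output is a hypothetical helper name of my own.

```shell
# Hypothetical helper: read boot2docker's startup output on stdin and print
# the tcp:// address from the first "export DOCKER_HOST=..." line.
docker_host_from_output() {
  sed -n 's|.*export DOCKER_HOST=\(tcp://[^[:space:]]*\).*|\1|p' | head -n 1
}
```

Usage could look like export DOCKER_HOST="$(boot2docker up 2>&1 | docker_host_from_output)", where the 2>&1 is there on the assumption that the message may arrive on stderr.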

Check that the error is gone and Docker is working properly:


$ docker info
Containers: 0
Images: 0
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Dirs: 0
Execution Driver: native-0.2
Kernel Version: 3.15.3-tinycore64
Debug mode (server): true
Debug mode (client): false
Fds: 10
Goroutines: 10
EventsListeners: 0
Init Path: /usr/local/bin/docker


More good tips about docker:
https://gist.github.com/wsargent/7049221


Friday, July 11, 2014

Naive Bayes classifier using Mahout



Thomas Bayes was a Presbyterian minister whose "An Essay towards solving a Problem in the Doctrine of Chances" was published posthumously in 1763. The scientific community came to appreciate the value of this work only long after his death. The essay is a study of conditional probability.

The Naive Bayes classifier built on it is used for supervised learning in data mining.
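
As a reminder of what such a classifier computes, here is the textbook form of the rule. The notation (class c, document d, features w_1..w_n) is my own, and Mahout works on TF-IDF weighted vectors, so its implementation may differ in detail.

```latex
% Bayes' rule for a class c given a document d
P(c \mid d) = \frac{P(c)\, P(d \mid c)}{P(d)}

% "Naive" assumption: the features w_1, ..., w_n of d are
% conditionally independent given the class
P(d \mid c) \approx \prod_{i=1}^{n} P(w_i \mid c)

% Decision rule: pick the class with the highest posterior
% (P(d) is the same for every class and can be dropped)
\hat{c} = \arg\max_{c} \Big[ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c) \Big]
```

Training amounts to estimating P(c) and P(w_i | c) from labelled examples, which is why the data has to be turned into labelled vectors first.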

To train a Naive Bayes classifier on a data set, you first need to convert the data set to Hadoop sequence files. The conversion takes the input files and generates one chunk file. The command below shows how to use Mahout to generate these files:

$ ./mahout seqdirectory -i ${WORK_DIR}/input_files -o ${WORK_DIR}/new_sequencefiles

This command takes every file in the directory ${WORK_DIR}/input_files as input and transforms them into a sequence file.
Check the help output of mahout to find out more about the command:
$ ./mahout seqdirectory --help


The following Hadoop command shows the contents of the resulting sequence file:

$ hadoop fs -text ${WORK_DIR}/new_sequencefiles | more

Many machine-learning and data-mining algorithms operate on vectors, which you have to provide.

The Naive Bayes algorithm does not work with words and raw data, but with weighted vectors associated with the documents. To transform the raw data or text into weighted vectors, Mahout provides a convenient command:
$ ./mahout seq2sparse -i ${WORK_DIR}/new_sequencefiles -o ${WORK_DIR}/new_vectorfiles \
    -lnorm -nv -wt tfidf

Refer to the command's help output for details on -lnorm (log normalization of the output vectors), -nv (named vectors) and -wt tfidf (TF-IDF weighting).

We now need to train the algorithm. A common approach is to split the vectors 80/20: 80% of the vectors are used to train the algorithm and 20% are held back to test it.
Mahout provides a command to split the weighted vectors:

$ ./mahout split \
    -i ${WORK_DIR}/new_vectorfiles/tfidf-vectors \
    --trainingOutput ${WORK_DIR}/new-train-vectors \
    --testOutput ${WORK_DIR}/new-test-vectors \
    --randomSelectionPct 20 --overwrite --sequenceFiles -xm sequential

After this command has run we have two directories, one with the training vectors and one with the test vectors.

Side note: Sqoop is Apache software for pulling data out of an RDBMS and importing it into HDFS, which is one way to prepare data for analysis with Mahout.

First we train the Naive Bayes algorithm with the vectors from ${WORK_DIR}/new-train-vectors as follows:

$ ./mahout trainnb \
    -i ${WORK_DIR}/new-train-vectors -el \
    -o ${WORK_DIR}/model \
    -li ${WORK_DIR}/labelindex \
    -ow

This generates a model in the form of a binary file. The model contains the weight matrix and the feature and label sums.

We test the algorithm against the 20% of the initial input vectors that were held back:
$ ./mahout testnb \
    -i ${WORK_DIR}/new-test-vectors \
    -m ${WORK_DIR}/model \
    -l ${WORK_DIR}/labelindex \
    -ow -o ${WORK_DIR}/new-testing

After this command you will see the accuracy of the trained classifier. An accuracy of at least 80% should be good enough.

Use the following command to dump the vectors to a text file:
$ ./mahout vectordump -i ${WORK_DIR}/new_vectorfiles/tfidf-vectors/part-r-00000 \
    -o ${WORK_DIR}/new_vectorfiles/tfidf-vectors/part-r-00000dump

Read more at:
https://mahout.apache.org/users/classification/twenty-newsgroups.html