Monday, December 19, 2011

Document Management - Sphinx

"Sphinx is a tool that makes it easy to create intelligent and beautiful documentation, written by Georg Brandl and licensed under the BSD license.", says the official web site of Sphinx.

In my opinion, Sphinx is a tool for generating documents by writing their contents as source text. It is especially suited to manuals and specifications, because besides the text itself it can generate diagram images with tools such as actdiag, blockdiag, nwdiag, and seqdiag.



Now, I'm going to list the installation process of Sphinx.

The platform is the same as the one I used when installing Alfresco.
OS              CentOS release 5.7 (Final)
Kernel          2.6.18-274.el5 x86_64
Package group   Development Libraries
                Dialup Networking Support
                GNOME Software Development
                Legacy Software Development
                Legacy Software Support
                Mail Server
                Network Servers
                System Tools
                Yum Utilities
Python          python26-2.6.5-6.el5
                python26-devel-2.6.5-6.el5
                python26-distribute-0.6.10-4.el5
                python-imaging-1.1.7-4.el5
                python-imaging-devel-1.1.7-4.el5
                python26-libs-2.6.5-6.el5
                python-setuptools-0.6

The extension modules for Sphinx and their roles are listed below.
sphinx                    Sphinx core package
rst2pdf                   Generate PDF
blockdiag                 Generate block-diagram image
sphinxcontrib-blockdiag   Embed block-diagram image
seqdiag                   Generate sequence-diagram image
sphinxcontrib-seqdiag     Embed sequence-diagram image
actdiag                   Generate activity-diagram image
sphinxcontrib-actdiag     Embed activity-diagram image
nwdiag                    Generate network-diagram image
sphinxcontrib-nwdiag      Embed network-diagram image


  • Install epel repository
$ wget http://download.fedora.redhat.com/pub//5/x86_64/epel-release-5-4.noarch.rpm
$ sudo rpm -ivh epel-release-5-4.noarch.rpm
$ sudo sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/epel.repo
  •  Install python-2.6
$ sudo yum -y --enablerepo=epel install python26 python26-libs python26-devel python-imaging python-imaging-devel python-setuptools
  •   Install sphinx and the extended modules
$ sudo easy_install-2.6 sphinx
$ sudo easy_install-2.6 rst2pdf
$ sudo easy_install-2.6 blockdiag
$ sudo easy_install-2.6 sphinxcontrib-blockdiag
$ sudo easy_install-2.6 seqdiag
$ sudo easy_install-2.6 sphinxcontrib-seqdiag
$ sudo easy_install-2.6 actdiag
$ sudo easy_install-2.6 sphinxcontrib-actdiag
$ sudo easy_install-2.6 nwdiag
$ sudo easy_install-2.6 sphinxcontrib-nwdiag
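
Before moving on, it can help to confirm that the command-line entry points actually ended up on the PATH. The following is only a minimal sketch and not part of the procedure I ran. The helper name `check_commands` is hypothetical, and it assumes each package installed its usual console script.

```shell
#!/bin/sh
# check_commands CMD...: report which of the given commands are on the PATH.
# Returns 0 when every command is found, 1 otherwise.
check_commands() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found:   $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `check_commands sphinx-build sphinx-quickstart rst2pdf blockdiag seqdiag actdiag nwdiag`.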

After installing Sphinx, let's create a sample project to examine its behavior.
  • Setup the document source
$ sudo sphinx-quickstart

Please enter values for the following settings (just press Enter to
accept a default value, if one is given in brackets).

Enter the root path for documentation.
> Root path for the documentation [.]: project

You have two options for placing the build directory for Sphinx output.
Either, you use a directory "_build" within the root path, or you separate
"source" and "build" directories within the root path.
> Separate source and build directories (y/N) [n]: y

Inside the root directory, two more directories will be created; "_templates"
for custom HTML templates and "_static" for custom stylesheets and other static
files. You can enter another prefix (such as ".") to replace the underscore.
> Name prefix for templates and static dir [_]:

The project name will occur in several places in the built documentation.
> Project name: Experiment
> Author name(s): naoya hashimoto

Sphinx has the notion of a "version" and a "release" for the
software. Each version can have multiple releases. For example, for
Python the version is something like 2.5 or 3.0, while the release is
something like 2.5.1 or 3.0a1.  If you don't need this dual structure,
just set both to the same value.
> Project version: 1.00
> Project release [1.00]:

The file name suffix for source files. Commonly, this is either ".txt"
or ".rst".  Only files with this suffix are considered documents.
> Source file suffix [.rst]:

One document is special in that it is considered the top node of the
"contents tree", that is, it is the root of the hierarchical structure
of the documents. Normally, this is "index", but if your "index"
document is a custom template, you can also set this to another filename.
> Name of your master document (without suffix) [index]:

Sphinx can also add configuration for epub output:
> Do you want to use the epub builder (y/N) [n]: y

Please indicate if you want to use one of the following Sphinx extensions:
> autodoc: automatically insert docstrings from modules (y/N) [n]: y
> doctest: automatically test code snippets in doctest blocks (y/N) [n]: y
> intersphinx: link between Sphinx documentation of different projects (y/N) [n]: y
> todo: write "todo" entries that can be shown or hidden on build (y/N) [n]: y
> coverage: checks for documentation coverage (y/N) [n]: y
> pngmath: include math, rendered as PNG images (y/N) [n]: y
> mathjax: include math, rendered in the browser by MathJax (y/N) [n]: y
Note: pngmath and mathjax cannot be enabled at the same time.
pngmath has been deselected.
> ifconfig: conditional inclusion of content based on config values (y/N) [n]: y
> viewcode: include links to the source code of documented Python objects (y/N) [n]: y

A Makefile and a Windows command file can be generated for you so that you
only have to run e.g. `make html' instead of invoking sphinx-build
directly.
> Create Makefile? (Y/n) [y]: y
> Create Windows command file? (Y/n) [y]: y

Creating file project/source/conf.py.
Creating file project/source/index.rst.
Creating file project/Makefile.
Creating file project/make.bat.

Finished: An initial directory structure has been created.

You should now populate your master file project/source/index.rst and create other documentation
source files. Use the Makefile to build the docs, like so:
   make builder
where "builder" is one of the supported builders, e.g. html, latex or linkcheck.
  •  Define document structure

$ cd project
$ vim source/index.rst
.. Experiment documentation master file, created by
   sphinx-quickstart on Sun Dec 11 15:10:10 2011.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Experiment's documentation!
======================================

Contents:

.. toctree::
   :maxdepth: 2

   expert_python   ← added: the name of the child document

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
  •  Add content

$ cat > source/expert_python.rst << EOF
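=========================
Expert Python Programming
=========================

:Author: Tarek
:Publisher: Packt Publishing

Contents
========

A book for Python experts. Starting with the proper use of the more
esoteric syntax, with reference to Python's internal algorithms, it goes on
to cover testing tools and continuous integration tools for agile software
development in Python, design patterns for writing better Python programs,
performance tuning, and more. A broad and deep book.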
EOF
  •  Build the content
$ make html
sphinx-build -b html -d build/doctrees   source build/html
Making output directory...
Running Sphinx v1.1.2
loading pickled environment... not yet created
loading intersphinx inventory from http://docs.python.org/objects.inv...
building [html]: targets for 2 source files that are out of date
updating environment: 2 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index
writing additional files... (0 module code pages) genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

Build finished. The HTML pages are in build/html.
  •  See the content

Finally, as a small practical example, I'll list the services enabled at each runlevel with the chkconfig command and turn the result into a csv-table.

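# Write a reST heading and the csv-table header, then one CSV row per service.
{ echo "
chkconfig list
--------------

.. csv-table:: Daemon
   :header: "Daemon", "0", "1", "2", "3", "4", "5", "6"
   :widths: 50, 10, 10, 10, 10, 10, 10, 10

" ;
# perl quotes each field and drops the "N:" runlevel prefix; sed indents the row.
chkconfig --list | sort | perl -pe 's/\s*(\d:)?([^\s]+) */,"$2"/g' | sed -e 's/^,/   /' ;
} >> system_daemon.rst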
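
The perl and sed steps in the pipeline above can also be written as one awk step. This is just a sketch of an alternative, not the script I actually ran. The function name `to_csv_rows` is hypothetical, and it assumes the input contains only lines of the form "name 0:off 1:off ...".

```shell
#!/bin/sh
# to_csv_rows: read "name 0:off 1:on ..." lines on stdin and print one
# quoted, comma-separated row per line, indented three spaces so it can
# sit inside a reST csv-table directive.
to_csv_rows() {
  awk 'NF > 0 {
    row = "\"" $1 "\""
    for (i = 2; i <= NF; i++) {
      sub(/^[0-9]:/, "", $i)      # drop the "N:" runlevel prefix
      row = row ",\"" $i "\""
    }
    print "   " row
  }'
}
```

It would be used as `chkconfig --list | sort | to_csv_rows >> system_daemon.rst`.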

  • the source file
$ cat system_daemon.rst
.. csv-table:: Daemon
   :header: "Daemon", "0", "1", "2", "3", "4", "5", "6"
   :widths: 50, 10, 10, 10, 10, 10, 10, 10

   "NetworkManager","off","off","off","off","off","off","off"
   "acpid","off","off","on","on","on","on","off"
   "alfresco","off","off","on","on","on","on","off"
   "anacron","off","off","on","on","on","on","off"
   "atd","off","off","off","off","off","off","off"
   "auditd","off","off","on","on","on","on","off"
   "autofs","off","off","off","off","off","off","off"
   "avahi-daemon","off","off","off","off","off","off","off"
   "avahi-dnsconfd","off","off","off","off","off","off","off"
   "bluetooth","off","off","off","off","off","off","off"
   "conman","off","off","off","off","off","off","off"
   "cpuspeed","off","on","on","on","on","on","off"
   "crond","off","off","on","on","on","on","off"
   "dnsmasq","off","off","off","off","off","off","off"
   "dund","off","off","off","off","off","off","off"
   "firstboot","off","off","off","on","off","on","off"
   "gpm","off","off","on","on","on","on","off"
   "haldaemon","off","off","off","off","off","off","off"
   "hidd","off","off","on","on","on","on","off"
   "ip6tables","off","off","off","off","off","off","off"
   "iptables","off","off","off","off","off","off","off"
   "irda","off","off","off","off","off","off","off"
   "irqbalance","off","off","on","on","on","on","off"
   "iscsi","off","off","off","off","off","off","off"
   "iscsid","off","off","off","off","off","off","off"
   "kudzu","off","off","off","off","off","off","off"
   "lvm2-monitor","off","on","on","on","on","on","off"
   "mcstrans","off","off","off","off","off","off","off"
   "mdmonitor","off","off","off","off","off","off","off"
   "mdmpd","off","off","off","off","off","off","off"
   "messagebus","off","off","off","on","on","on","off"
   "microcode_ctl","off","off","on","on","on","on","off"
   "multipathd","off","off","off","off","off","off","off"
   "netconsole","off","off","off","off","off","off","off"
   "netfs","off","off","off","off","off","off","off"
   "netplugd","off","off","off","off","off","off","off"
   "network","off","off","on","on","on","on","off"
   "nfs","off","off","off","off","off","off","off"
   "nfslock","off","off","off","off","off","off","off"
   "nscd","off","off","off","off","off","off","off"
   "oddjobd","off","off","off","off","off","off","off"
   "pand","off","off","off","off","off","off","off"
   "pcscd","off","off","off","off","off","off","off"
   "portmap","off","off","off","off","off","off","off"
   "postgresql-9.1","off","off","on","on","on","on","off"
   "psacct","off","off","off","off","off","off","off"
   "rawdevices","off","off","off","on","on","on","off"
   "rdisc","off","off","off","off","off","off","off"
   "readahead_early","off","off","on","on","on","on","off"
   "readahead_later","off","off","off","off","off","on","off"
   "restorecond","off","off","off","off","off","off","off"
   "rpcgssd","off","off","off","off","off","off","off"
   "rpcidmapd","off","off","off","off","off","off","off"
   "rpcsvcgssd","off","off","off","off","off","off","off"
   "saslauthd","off","off","off","off","off","off","off"
   "sendmail","off","off","off","off","off","off","off"
   "smartd","off","off","off","off","off","off","off"
   "sshd","off","off","on","on","on","on","off"
   "svnserve","off","off","off","off","off","off","off"
   "syslog","off","off","on","on","on","on","off"
   "tcsd","off","off","off","off","off","off","off"
   "vmware-tools","off","off","on","on","off","on","off"
   "wpa_supplicant","off","off","off","off","off","off","off"
   "xfs","off","off","off","off","off","off","off"
   "ypbind","off","off","off","off","off","off","off"
   "yum-updatesd","off","off","off","off","off","off","off"


  • the output image

Daemon
Daemon 0 1 2 3 4 5 6
NetworkManager off off off off off off off
acpid off off on on on on off
alfresco off off on on on on off
anacron off off on on on on off
atd off off off off off off off
auditd off off on on on on off
autofs off off off off off off off
avahi-daemon off off off off off off off
avahi-dnsconfd off off off off off off off
bluetooth off off off off off off off
conman off off off off off off off
cpuspeed off on on on on on off
crond off off on on on on off
dnsmasq off off off off off off off
dund off off off off off off off
firstboot off off off on off on off
gpm off off on on on on off
haldaemon off off off off off off off
hidd off off on on on on off
ip6tables off off off off off off off
iptables off off off off off off off
irda off off off off off off off
irqbalance off off on on on on off
iscsi off off off off off off off
iscsid off off off off off off off
kudzu off off off off off off off
lvm2-monitor off on on on on on off
mcstrans off off off off off off off
mdmonitor off off off off off off off
mdmpd off off off off off off off
messagebus off off off on on on off
microcode_ctl off off on on on on off
multipathd off off off off off off off
netconsole off off off off off off off
netfs off off off off off off off
netplugd off off off off off off off
network off off on on on on off
nfs off off off off off off off
nfslock off off off off off off off
nscd off off off off off off off
oddjobd off off off off off off off
pand off off off off off off off
pcscd off off off off off off off
portmap off off off off off off off
postgresql-9.1 off off on on on on off
psacct off off off off off off off
rawdevices off off off on on on off
rdisc off off off off off off off
readahead_early off off on on on on off
readahead_later off off off off off on off
restorecond off off off off off off off
rpcgssd off off off off off off off
rpcidmapd off off off off off off off
rpcsvcgssd off off off off off off off
saslauthd off off off off off off off
sendmail off off off off off off off
smartd off off off off off off off
sshd off off on on on on off
svnserve off off off off off off off
syslog off off on on on on off
tcsd off off off off off off off
vmware-tools off off on on off on off
wpa_supplicant off off off off off off off
xfs off off off off off off off
ypbind off off off off off off off
yum-updatesd off off off off off off off
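
The diagram extensions installed earlier are not used in the sample project above. As a rough sketch of how one of them could be tried, the helper below enables sphinxcontrib-blockdiag and writes a small document with a block diagram. The function name `add_blockdiag_sample`, the file name `diagram_sample.rst`, and the node names are all placeholders of mine, and it assumes conf.py already defines an `extensions` list, as the file generated by sphinx-quickstart does.

```shell
#!/bin/sh
# add_blockdiag_sample SOURCE_DIR
# Enable sphinxcontrib-blockdiag in SOURCE_DIR/conf.py and write a small
# document that embeds a block diagram.
add_blockdiag_sample() {
  src=$1

  # conf.py is plain Python, so appending to the extensions list is enough.
  echo "extensions.append('sphinxcontrib.blockdiag')" >> "$src/conf.py"

  # Quoted heredoc delimiter: the body is written out literally.
  cat > "$src/diagram_sample.rst" << 'EOF'
Diagram sample
==============

.. blockdiag::

   blockdiag {
      rst -> sphinx -> html;
   }
EOF
}
```

After running `add_blockdiag_sample source` in the project directory, `diagram_sample` would still need to be added to the toctree in index.rst before running `make html`.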

Friday, December 16, 2011

Document Management - Knowledge Tree

Having covered the Alfresco installation in the previous post, this time I'll install Knowledge Tree.
  • System Information
The platform is the same as the one I used when installing Alfresco.
OS               CentOS release 5.7 (Final)
Kernel           2.6.18-274.el5 x86_64
Package group    Development Libraries
                 Dialup Networking Support
                 GNOME Software Development
                 Legacy Software Development
                 Legacy Software Support
                 Mail Server
                 Network Servers
                 System Tools
                 Yum Utilities
Knowledge Tree   3.7


Let's see the process of installing Knowledge Tree.
  • Exclude the PHP 5.3 packages from YUM
# echo "exclude=mod-php-5.3*,php-5.3*,zend-server-php-5.3*" >> /etc/yum.conf
  • Get the binary and extract it
# wget http://repos.knowledgetree.com/downloads/ktdms-ce-linux-latest
# tar zxf kt-ce-linux-universal-installer-3.7.tgz
# cd knowledgetree-ce-linux-universal-installer-3.7
  •  Install Knowledge Tree (after you hit <Enter>, the script begins installing via YUM)
# ./knowledgetree-community.sh
Running this script will preform the following:
* Configure your package manager to use the KnowledgeTree repository
* Install KnowledgeTree on your system using your package manager

Hit ENTER to install KnowledgeTree, or Ctrl+C to abort now.
...
Complete!

KnowledgeTree was successfully installed.

Please open your Internet browser and browse to http://127.0.0.1/KnowledgeTree/ to continue setup.
  • show the listening sockets
# netstat -lnpt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:10081               0.0.0.0:*                   LISTEN      6996/lighttpd
tcp        0      0 0.0.0.0:10082               0.0.0.0:*                   LISTEN      6996/lighttpd
tcp        0      0 127.0.0.1:10083             0.0.0.0:*                   LISTEN      6801/httpd
tcp        0      0 0.0.0.0:80                  0.0.0.0:*                   LISTEN      6801/httpd
tcp        0      0 0.0.0.0:22                  0.0.0.0:*  
    
    
    • check that the zend-server service is enabled
    # chkconfig --list zend-server
    zend-server     0:off   1:off   2:on    3:on    4:on    5:on    6:off
    • initialize the mysql data directory
    # mysql_install_db --datadir=/var/lib/mysql
    • start the mysql daemon
    # /etc/init.d/mysql start
    Starting MySQL:                                            [  OK  ]
    • secure mysql users, drop test database
    # mysql_secure_installation
    ...
    Set root password? [Y/n] Y
    New password:
    Re-enter new password:
    ...
    Remove anonymous users? [Y/n] Y
    ...
    Disallow root login remotely? [Y/n] Y
    ...
    Remove test database and access to it? [Y/n] Y
    ...
    Reload privilege tables now? [Y/n] Y
    ...
    Thanks for using MySQL!
    • license agreement

    • installation type
    • checking PHP dependencies
    • system configuration
    • checking service dependencies
    • confirming database configuration
    • registering Knowledge Tree
    • finalizing system information
    • installation completed
    • enter the login page
    • Dashboard
    • set up the mysql user for Knowledge Tree (if logging in with the password fails, the mysql user might not have been created)
    # mysql -uroot -p -e "GRANT ALL ON dms.* TO dmsadmin@'127.0.0.1' IDENTIFIED BY 'password'; FLUSH PRIVILEGES;"

    Sunday, December 11, 2011

    Document Management - Alfresco

    Recently, I've had more opportunities than before to write documents such as technical specifications, operating manuals, and stress test reports. The more documents I write, the more it costs to manage the different kinds. In particular, it is quite tiresome to upload a document written on my local PC to the file server every time it is updated, which is often.

    To reduce the cost of managing documents, I am considering introducing a DMS (Document Management System) or ECM (Enterprise Content Management) system. In addition, I want to write operating manuals more efficiently and easily, using something other than a wiki such as PukiWiki or MS Office products.

    Currently, I am thinking of using the software below and will walk through installing each of them.
    I will cover the details and the pros and cons after the installation walkthroughs.


    • System Information
    The OS and middleware are summarized below.

    OS              CentOS release 5.7 (Final)
    Kernel          2.6.18-274.el5 x86_64
    Package group   Development Libraries
                    Dialup Networking Support
                    GNOME Software Development
                    Legacy Software Development
                    Legacy Software Support
                    Mail Server
                    Network Servers
                    System Tools
                    Yum Utilities
    Alfresco        alfresco-community-4.0.b


    First of all, I'm going to list the process of installing Alfresco.
    • Get the binary
    # wget http://dl.alfresco.com/release/community/build-3835/alfresco-community-4.0.b-installer-linux-x64.bin 
    • Install alfresco (I chose Japanese as the installation language; the installer's Japanese prompts are shown here in English translation)


    # chmod +x ./alfresco-community-4.0.b-installer-linux-x64.bin
    # ./alfresco-community-4.0.b-installer-linux-x64.bin --mode text
    Language Selection

     Please select the installation language
     [1] English - English
     [2] French - Francais
     [3] Spanish - Espanol
     [4] Italian - Italiano
     [5] German - Deutsch
     [6] Japanese - 日本語
     Please choose an option [1] : 6
     ----------------------------------------------------------------------------
     Welcome to the Alfresco Community Setup Wizard.

     ----------------------------------------------------------------------------
     Installation Type

     [1] Easy - Installs the server with the default settings
     [2] Advanced - Configures server ports and service properties.: Also choose optional components to install.
     Please choose an option [1] : 2

     ----------------------------------------------------------------------------
     Select the components you want to install. Click "Next" when you are ready.

     Java [Y/n] :Y

     PostgreSQL [Y/n] :Y

     Alfresco : Y (Cannot be edited)

     SharePoint [Y/n] :Y

     Web Quick Start [y/N] : Y

     OpenOffice [Y/n] :Y

     Please confirm that the selection above is correct. [Y/n]: Y

     ----------------------------------------------------------------------------
     Installation Folder

     Please choose a folder to install Alfresco Community

     Select a folder. [/opt/alfresco-4.0.b]:

     ----------------------------------------------------------------------------
     Database Server Parameters

     Please enter the port of the database

     Database Server Port [5432]:

     ----------------------------------------------------------------------------
     Tomcat Port Configuration

     Please enter the parameters required for Tomcat.

     Web Server Domain: [127.0.0.1]:

     Tomcat Server Port: [8080]:

     Tomcat Shutdown Port: [8005]:

     Tomcat SSL Port [8443]:

     Tomcat AJP Port: [8009]:

     ----------------------------------------------------------------------------
     Alfresco FTP Port

     Please choose the port number to use for the integrated Alfresco FTP server.

     Port: [21]:

     ----------------------------------------------------------------------------
     Alfresco RMI Port

     Please choose the port number Alfresco uses to execute remote commands.

     Port: [50500]:

     ----------------------------------------------------------------------------
     Admin Password

     Enter the password to use for the Alfresco administrator account.

     Admin Password: :
     Repeat Password: :
     ----------------------------------------------------------------------------
     Alfresco SharePoint Port

     Please choose the port number for the SharePoint protocol.

     Port: [7070]:

     ----------------------------------------------------------------------------
     Install as a service

     You can optionally register Alfresco Community as a service. This way it will be started automatically every time the machine boots.

     Install Alfresco Community as a service? [Y/n]: y


     ----------------------------------------------------------------------------
     OpenOffice Server Port

     Please enter the default LISTEN port of the OpenOffice server.

     OpenOffice Server Port [8100]:

     ----------------------------------------------------------------------------
     Setup is now ready to install Alfresco Community on your computer.

     Do you want to continue? [Y/n]: y

     ----------------------------------------------------------------------------
     Please wait while Alfresco Community is installed on your computer.

     Installing
     0% ______________ 50% ______________ 100%
     #########################################

     ----------------------------------------------------------------------------
     The setup wizard has finished installing Alfresco Community.

     View Readme File [Y/n]: Y

     Launch Alfresco Community Share [Y/n]: n

     README
     Alfresco Community 4.0
     ======================

     For Enterprise subscribers, refer to http://support.alfresco.com for release
     notes and detailed information on this release.

     For Community members, refer to the Alfresco wiki for more information on this
     release.

     Press [Enter] to continue :
    • Activate the service
    
    
    # chkconfig alfresco on
    
    
    • Start alfresco
    # /etc/init.d/alfresco start
    /opt/alfresco-4.0.b/postgresql/scripts/ctl.sh : postgresql  started at port 5432
    Using CATALINA_BASE:   /opt/alfresco-4.0.b/tomcat
    Using CATALINA_HOME:   /opt/alfresco-4.0.b/tomcat
    Using CATALINA_TMPDIR: /opt/alfresco-4.0.b/tomcat/temp
    Using JRE_HOME:        /opt/alfresco-4.0.b/java
    Using CLASSPATH:       /opt/alfresco-4.0.b/tomcat/bin/bootstrap.jar
    /opt/alfresco-4.0.b/tomcat/scripts/ctl.sh : tomcat started 
    • Check the startup
    # /etc/init.d/alfresco status
    tomcat already running
    postgresql already running 
    • show the listening sockets
    # netstat -lntp | sort -k 5
    tcp        0      0 0.0.0.0:21                  0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      2149/sshd
    tcp        0      0 0.0.0.0:139                 0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:445                 0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:775                 0.0.0.0:*                   LISTEN      1865/rpc.statd
    tcp        0      0 0.0.0.0:7070                0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:8009                0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:8080                0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:8443                0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:47859               0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:50500               0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:50508               0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:56950               0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 0.0.0.0:59863               0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 127.0.0.1:5432              0.0.0.0:*                   LISTEN      12387/postgres
    tcp        0      0 127.0.0.1:8005              0.0.0.0:*                   LISTEN      12407/java
    tcp        0      0 127.0.0.1:8100              0.0.0.0:*                   LISTEN      12502/soffice.bin
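
Rather than reading the socket list by eye, the ports chosen in the installer could be checked with a small filter. This is only a sketch: the helper name `missing_ports` is hypothetical, and it assumes the column layout of the netstat output shown above (local address in column 4, state in column 6).

```shell
#!/bin/sh
# missing_ports PORT...: read `netstat -lnt` output on stdin and print each
# expected port that has no LISTEN socket. The exit status is 0 only when
# all the given ports are listening.
missing_ports() {
  # The port is whatever follows the last ":" of the local address.
  listening=$(awk '$6 == "LISTEN" { n = split($4, part, ":"); print part[n] }')
  status=0
  for port in "$@"; do
    if ! printf '%s\n' "$listening" | grep -qx "$port"; then
      echo "not listening: $port"
      status=1
    fi
  done
  return "$status"
}
```

For the ports accepted during the installation, that would be `netstat -lnt | missing_ports 8080 8443 8009 5432 8100`.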
    ID admin
    PW admin
    • Dashboard

        Saturday, November 19, 2011

        MySQL MHA - init script

        I wrote a typical init script for mhamanager and verified starting it, stopping it, and showing its status.
        Though I used to publish the script here, I have moved it to my GitHub.

        • start daemon
        # /etc/init.d/mhamanager start
        Starting /etc/init.d/mhamanager: nohup: appending output to `nohup.out' 
        • verify that the daemon is running
        # /etc/init.d/mhamanager status
        app1 (pid:15293) is running(0:PING_OK), master:ha-db01
        • stop
        # /etc/init.d/mhamanager stop
        Shutting down /etc/init.d/mhamanager: Stopped app1 successfully.
        • verify that the daemon has stopped
        # /etc/init.d/mhamanager status
        app1 is stopped(2:NOT_RUNNING). 
        • show usage
        # /etc/init.d/mhamanager
        Usage: /etc/init.d/mhamanager {start|stop|restart|condrestart|status|checkrepl}
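
The status lines above carry a numeric code and a state name in parentheses. For a monitoring script, a small filter like the following could pull out the state name. This is a sketch based only on the two status lines shown above, and the helper name `mha_state` is hypothetical.

```shell
#!/bin/sh
# mha_state: read a status line on stdin and print the state name found
# in the "(code:STATE)" part, e.g. PING_OK or NOT_RUNNING.
# Lines without such a part produce no output.
mha_state() {
  sed -n 's/.*([0-9][0-9]*:\([A-Z_]*\)).*/\1/p'
}
```

For example, `/etc/init.d/mhamanager status | mha_state` would print only the state name.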

        MySQL MHA - Failover

        Lastly, we'll look at MySQL master failover. I'll trigger it by stopping the mysql daemon on the master.
        The relationships between master and slaves, and between the MHA nodes and the MHA manager, are the same as before.

        The global and application configuration files are also unchanged.
        Please see "MySQL MHA - Switchover" if you need the details of those configuration files.

        DB(Master) + MHA Manager   192.168.100.200 (ha-mgr01)
        DB(Slave) + MHA Node       192.168.100.197 (ha-db01)
        DB(Slave) + MHA Node       192.168.100.198 (ha-db02)

        First, stop the master DB (ha-db01).
        Second, confirm that Master_Host is transferred from the master (ha-db01) to the slave (ha-db02).

        • Currently logged in to the MHA Manager (ha-mgr01)
        • run the manager
        # masterha_manager --conf=/etc/app1.cnf
        Thu Jul 28 12:27:19 2011 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
        Thu Jul 28 12:27:19 2011 - [info] Reading application default configurations from /etc/app1.cnf..
        Thu Jul 28 12:27:19 2011 - [info] Reading server configurations from /etc/app1.cnf.. 
        
        • check the status of Master server
        # masterha_check_status --conf=/etc/app1.cnf
        app1 (pid:26638) is running(0:PING_OK), master:ha-db01 
        
        • stop mysql daemon on Master DB
        # ssh ha-db01 '/etc/init.d/mysql stop'
        • check the status of Master server
        # masterha_check_status --conf=/etc/app1.cnf
        app1 is stopped(2:NOT_RUNNING).
        
        • check the status of slave hosts
        # mysql -uroot -pmysql -e 'SHOW SLAVE HOSTS\G' -h ha-db02
        *************************** 1. row ***************************
        Server_id: 300
             Host: 
             Port: 3306
        Master_id: 200
        
        • verify that Master_Host has been transferred from Master(ha-db01) to Slave(ha-db02)
        # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h localhost
        *************************** 1. row ***************************
                       Slave_IO_State: Waiting for master to send event
                          Master_Host: 192.168.100.198
                          Master_User: replication
                          Master_Port: 3306
                        Connect_Retry: 60
                      Master_Log_File: mysql-bin.000006
                  Read_Master_Log_Pos: 107
                       Relay_Log_File: ha-mgr01-relay-bin.000002
                        Relay_Log_Pos: 253
                Relay_Master_Log_File: mysql-bin.000006
                     Slave_IO_Running: Yes
                    Slave_SQL_Running: Yes
                      Replicate_Do_DB: 
                  Replicate_Ignore_DB: 
                   Replicate_Do_Table: 
               Replicate_Ignore_Table: 
              Replicate_Wild_Do_Table: 
          Replicate_Wild_Ignore_Table: 
                           Last_Errno: 0
                           Last_Error: 
                         Skip_Counter: 0
                  Exec_Master_Log_Pos: 107
                      Relay_Log_Space: 412
                      Until_Condition: None
                       Until_Log_File: 
                        Until_Log_Pos: 0
                   Master_SSL_Allowed: No
                   Master_SSL_CA_File: 
                   Master_SSL_CA_Path: 
                      Master_SSL_Cert: 
                    Master_SSL_Cipher: 
                       Master_SSL_Key: 
                Seconds_Behind_Master: 0
        Master_SSL_Verify_Server_Cert: No
                        Last_IO_Errno: 0
                        Last_IO_Error: 
                       Last_SQL_Errno: 0
                       Last_SQL_Error: 
          Replicate_Ignore_Server_Ids: 
                     Master_Server_Id: 200
        
        # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h ha-db02
        *************************** 1. row ***************************
                       Slave_IO_State: 
                          Master_Host: 192.168.100.197
                          Master_User: replication
                          Master_Port: 3306
                        Connect_Retry: 60
                      Master_Log_File: 
                  Read_Master_Log_Pos: 4
                       Relay_Log_File: mysqld-relay-bin.000001
                        Relay_Log_Pos: 4
                Relay_Master_Log_File: 
                     Slave_IO_Running: No
                    Slave_SQL_Running: No
                      Replicate_Do_DB: 
                  Replicate_Ignore_DB: 
                   Replicate_Do_Table: 
               Replicate_Ignore_Table: 
              Replicate_Wild_Do_Table: 
          Replicate_Wild_Ignore_Table: 
                           Last_Errno: 0
                           Last_Error: 
                         Skip_Counter: 0
                  Exec_Master_Log_Pos: 0
                      Relay_Log_Space: 126
                      Until_Condition: None
                       Until_Log_File: 
                        Until_Log_Pos: 0
                   Master_SSL_Allowed: No
                   Master_SSL_CA_File: 
                   Master_SSL_CA_Path: 
                      Master_SSL_Cert: 
                    Master_SSL_Cipher: 
                       Master_SSL_Key: 
                Seconds_Behind_Master: NULL
        Master_SSL_Verify_Server_Cert: No
                        Last_IO_Errno: 0
                        Last_IO_Error: 
                       Last_SQL_Errno: 0
                       Last_SQL_Error: 
          Replicate_Ignore_Server_Ids: 
                     Master_Server_Id: 100
        
        • Watching the MHA Manager's log during the failover
        # tail -f /var/log/masterha/app1/app1.log 
        Wed Jul 27 17:21:34 2011 - [info] MHA::MasterMonitor version 0.50.
        Wed Jul 27 17:21:35 2011 - [info] Dead Servers:
        Wed Jul 27 17:21:35 2011 - [info] Alive Servers:
        Wed Jul 27 17:21:35 2011 - [info]   ha-db01(192.168.100.197:3306)
        Wed Jul 27 17:21:35 2011 - [info]   ha-db02(192.168.100.198:3306)
        Wed Jul 27 17:21:35 2011 - [info]   ha-mgr01(192.168.100.200:3306)
        Wed Jul 27 17:21:35 2011 - [info] Alive Slaves:
        Wed Jul 27 17:21:35 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:21:35 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:21:35 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:21:35 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:21:35 2011 - [info] Current Master: ha-db01(192.168.100.197:3306)
        Wed Jul 27 17:21:35 2011 - [info] Checking slave configurations..
        Wed Jul 27 17:21:35 2011 - [warn]  read_only=1 is not set on slave ha-db02(192.168.100.198:3306).
        Wed Jul 27 17:21:35 2011 - [warn]  relay_log_purge=0 is not set on slave ha-db02(192.168.100.198:3306).
        Wed Jul 27 17:21:35 2011 - [warn]  read_only=1 is not set on slave ha-mgr01(192.168.100.200:3306).
        Wed Jul 27 17:21:35 2011 - [warn]  relay_log_purge=0 is not set on slave ha-mgr01(192.168.100.200:3306).
        Wed Jul 27 17:21:35 2011 - [info] Checking replication filtering settings..
        Wed Jul 27 17:21:35 2011 - [info]  binlog_do_db= , binlog_ignore_db= 
        Wed Jul 27 17:21:35 2011 - [info]  Replication filtering check ok.
        Wed Jul 27 17:21:35 2011 - [info] Starting SSH connection tests..
        Wed Jul 27 17:21:36 2011 - [info] All SSH connection tests passed successfully.
        Wed Jul 27 17:21:36 2011 - [info] Checking MHA Node version..
        Wed Jul 27 17:21:37 2011 - [info]  Version check ok.
        Wed Jul 27 17:21:37 2011 - [info] Checking SSH publickey authentication and checking recovery script configurations on the current master..
        Wed Jul 27 17:21:37 2011 - [info]   Executing command: save_binary_logs --command=test --start_file=mysql-bin.000044 --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.50 
        Wed Jul 27 17:21:37 2011 - [info]   Connecting to root@ha-db01(ha-db01).. 
          Creating /var/log/masterha/app1 if not exists..    ok.
          Checking output directory is accessible or not..
           ok.
          Binlog found at /var/lib/mysql, up to mysql-bin.000044
        Wed Jul 27 17:21:37 2011 - [info] Master setting check done.
        Wed Jul 27 17:21:37 2011 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
        Wed Jul 27 17:21:37 2011 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=ha-db02 --slave_ip=192.168.100.198 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --manager_version=0.50 --relay_log_info=/var/lib/mysql/relay-log.info  --slave_pass=xxx
        Wed Jul 27 17:21:37 2011 - [info]   Connecting to root@192.168.100.198(ha-db02).. 
          Checking slave recovery environment settings..
            Opening /var/lib/mysql/relay-log.info ... ok.
            Relay log found at /var/lib/mysql, up to mysqld-relay-bin.000018
            Temporary relay log file is /var/lib/mysql/mysqld-relay-bin.000018
            Testing mysql connection and privileges.. done.
            Testing mysqlbinlog output.. done.
            Cleaning up test file(s).. done.
        Wed Jul 27 17:21:38 2011 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=ha-mgr01 --slave_ip=192.168.100.200 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --manager_version=0.50 --relay_log_info=/var/lib/mysql/relay-log.info  --slave_pass=xxx
        Wed Jul 27 17:21:38 2011 - [info]   Connecting to root@192.168.100.200(ha-mgr01).. 
          Checking slave recovery environment settings..
            Opening /var/lib/mysql/relay-log.info ... ok.
            Relay log found at /var/lib/mysql, up to ha-mgr01-relay-bin.000002
            Temporary relay log file is /var/lib/mysql/ha-mgr01-relay-bin.000002
            Testing mysql connection and privileges.. done.
            Testing mysqlbinlog output.. done.
            Cleaning up test file(s).. done.
        Wed Jul 27 17:21:38 2011 - [info] Slaves settings check done.
        Wed Jul 27 17:21:38 2011 - [info] 
        ha-db01 (current master)
         +--ha-db02
         +--ha-mgr01
        Wed Jul 27 17:21:38 2011 - [warn] master_ip_failover_script is not defined.
        Wed Jul 27 17:21:38 2011 - [warn] shutdown_script is not defined.
        Wed Jul 27 17:21:38 2011 - [info] Set master ping interval 3 seconds.
        Wed Jul 27 17:21:38 2011 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s 192.168.100.198 --user=root --master_ip=192.168.100.197 --master_port=3306 --master_host=ha-db01 -s 192.168.100.197
        Wed Jul 27 17:21:38 2011 - [info] Starting ping health check on ha-db01(192.168.100.197:3306)..
        Wed Jul 27 17:21:38 2011 - [info] Ping succeeded, sleeping until it doesn't respond..
        Wed Jul 27 17:22:53 2011 - [warn] Got error on MySQL ping: 2006 (MySQL server has gone away)
        Wed Jul 27 17:22:53 2011 - [info] Executing seconary network check script: /usr/local/bin/masterha_secondary_check -s 192.168.100.198 --user=root --master_ip=192.168.100.197 --master_port=3306 --master_host=ha-db01 -s 192.168.100.197  --user=root  --master_host=ha-db01  --master_ip=192.168.100.197  --master_port=3306
        Wed Jul 27 17:22:53 2011 - [info] HealthCheck: SSH to ha-db01 is reachable.
        Monitoring server 192.168.100.198 is reachable, Master is not reachable from 192.168.100.198. OK.
        Monitoring server 192.168.100.197 is reachable, Master is not reachable from 192.168.100.197. OK.
        Wed Jul 27 17:22:54 2011 - [info] Master is not reachable from all other monitoring servers. Failover should start.
        Wed Jul 27 17:22:56 2011 - [warn] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.197' (111))
        Wed Jul 27 17:22:56 2011 - [warn] Connection failed 1 time(s)..
        Wed Jul 27 17:22:59 2011 - [warn] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.197' (111))
        Wed Jul 27 17:22:59 2011 - [warn] Connection failed 2 time(s)..
        Wed Jul 27 17:23:02 2011 - [warn] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.197' (111))
        Wed Jul 27 17:23:02 2011 - [warn] Connection failed 3 time(s)..
        Wed Jul 27 17:23:02 2011 - [warn] Master is not reachable from health checker!
        Wed Jul 27 17:23:02 2011 - [warn] Master ha-db01(192.168.100.197:3306) is not reachable!
        Wed Jul 27 17:23:02 2011 - [warn] SSH is reachable.
        Wed Jul 27 17:23:02 2011 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/app1.cnf again, and trying to connect to all servers to check server status..
        Wed Jul 27 17:23:02 2011 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
        Wed Jul 27 17:23:02 2011 - [info] Reading application default configurations from /etc/app1.cnf..
        Wed Jul 27 17:23:02 2011 - [info] Reading server configurations from /etc/app1.cnf..
        Wed Jul 27 17:23:02 2011 - [info] Dead Servers:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db01(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] Alive Servers:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)
        Wed Jul 27 17:23:02 2011 - [info] Alive Slaves:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] Checking slave configurations..
        Wed Jul 27 17:23:02 2011 - [warn]  read_only=1 is not set on slave ha-db02(192.168.100.198:3306).
        Wed Jul 27 17:23:02 2011 - [warn]  relay_log_purge=0 is not set on slave ha-db02(192.168.100.198:3306).
        Wed Jul 27 17:23:02 2011 - [warn]  read_only=1 is not set on slave ha-mgr01(192.168.100.200:3306).
        Wed Jul 27 17:23:02 2011 - [warn]  relay_log_purge=0 is not set on slave ha-mgr01(192.168.100.200:3306).
        Wed Jul 27 17:23:02 2011 - [info] Checking replication filtering settings..
        Wed Jul 27 17:23:02 2011 - [info]  Replication filtering check ok.
        Wed Jul 27 17:23:02 2011 - [info] Master is down!
        Wed Jul 27 17:23:02 2011 - [info] Terminating monitoring script.
        Wed Jul 27 17:23:02 2011 - [info] Got exit code 20 (Master dead).
        Wed Jul 27 17:23:02 2011 - [info] MHA::MasterFailover version 0.50.
        Wed Jul 27 17:23:02 2011 - [info] Starting master failover.
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] * Phase 1: Configuration Check Phase..
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] Dead Servers:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db01(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] Checking master reachability via mysql(double check)..
        Wed Jul 27 17:23:02 2011 - [info]  ok.
        Wed Jul 27 17:23:02 2011 - [info] Alive Servers:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)
        Wed Jul 27 17:23:02 2011 - [info] Alive Slaves:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] ** Phase 1: Configuration Check Phase completed.
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] * Phase 2: Dead Master Shutdown Phase..
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] Forcing shutdown so that applications never connect to the current master..
        Wed Jul 27 17:23:02 2011 - [warn] master_ip_failover_script is not set. Skipping invalidating dead master ip address.
        Wed Jul 27 17:23:02 2011 - [warn] shutdown_script is not set. Skipping explicit shutting down of the dead master.
        Wed Jul 27 17:23:02 2011 - [info] * Phase 2: Dead Master Shutdown Phase completed.
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] * Phase 3: Master Recovery Phase..
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] * Phase 3.1: Getting Latest Slaves Phase..
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] The latest binary log file/position on all slaves is mysql-bin.000044:107
        Wed Jul 27 17:23:02 2011 - [info] Latest slaves (Slaves that received relay log files to the latest):
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] The oldest binary log file/position on all slaves is mysql-bin.000044:107
        Wed Jul 27 17:23:02 2011 - [info] Oldest slaves:
        Wed Jul 27 17:23:02 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
        Wed Jul 27 17:23:02 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
        Wed Jul 27 17:23:02 2011 - [info] 
        Wed Jul 27 17:23:02 2011 - [info] Fetching dead master's binary logs..
        Wed Jul 27 17:23:02 2011 - [info] Executing command on the dead master ha-db01(192.168.100.197:3306): save_binary_logs --command=save --start_file=mysql-bin.000044  --start_pos=107 --binlog_dir=/var/lib/mysql --output_file=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50
          Creating /var/log/masterha/app1 if not exists..    ok.
         Concat binary/relay logs from mysql-bin.000044 pos 107 to mysql-bin.000044 EOF into /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog ..
          Dumping binlog format description event, from position 0 to 107.. ok.
          Dumping effective binlog data from /var/lib/mysql/mysql-bin.000044 position 107 to tail(126).. ok.
         Concat succeeded.
        Wed Jul 27 17:23:03 2011 - [info] scp from root@192.168.100.197:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog to local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog succeeded.
        Wed Jul 27 17:23:03 2011 - [info] HealthCheck: SSH to ha-db02 is reachable.
        Wed Jul 27 17:23:04 2011 - [info] HealthCheck: SSH to ha-mgr01 is reachable.
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info] * Phase 3.3: Determining New Master Phase..
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
        Wed Jul 27 17:23:04 2011 - [info] All slaves received relay logs to the same position. No need to resync each other.
        Wed Jul 27 17:23:04 2011 - [info] Searching new master from slaves..
        Wed Jul 27 17:23:04 2011 - [info]  Candidate masters from the configuration file:
        Wed Jul 27 17:23:04 2011 - [info]  Non-candidate masters:
        Wed Jul 27 17:23:04 2011 - [info] New master is ha-db02(192.168.100.198:3306)
        Wed Jul 27 17:23:04 2011 - [info] Starting master failover..
        Wed Jul 27 17:23:04 2011 - [info] 
        From:
        ha-db01 (current master)
         +--ha-db02
         +--ha-mgr01
        To:
        ha-db02 (new master)
         +--ha-mgr01
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
        Wed Jul 27 17:23:04 2011 - [info] Sending binlog..
        Wed Jul 27 17:23:04 2011 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog to root@ha-db02:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog succeeded.
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info] * Phase 3.4: Master Log Apply Phase..
        Wed Jul 27 17:23:04 2011 - [info] 
        Wed Jul 27 17:23:04 2011 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
        Wed Jul 27 17:23:04 2011 - [info] Starting recovery on ha-db02(192.168.100.198:3306)..
        Wed Jul 27 17:23:04 2011 - [info]  Generating diffs succeeded.
        Wed Jul 27 17:23:04 2011 - [info] Waiting until all relay logs are applied.
        Wed Jul 27 17:23:04 2011 - [info]  done.
        Wed Jul 27 17:23:05 2011 - [info] Getting slave status..
        Wed Jul 27 17:23:05 2011 - [info] This slave(ha-db02)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000044:107). No need to recover from Exec_Master_Log_Pos.
        Wed Jul 27 17:23:05 2011 - [info] Connecting to the target slave host ha-db02, running recover script..
        Wed Jul 27 17:23:05 2011 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=ha-db02 --slave_ip=192.168.100.198  --slave_port=3306 --apply_files=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --timestamp=20110727172302 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50 --slave_pass=xxx
        Wed Jul 27 17:23:05 2011 - [info] 
        Applying differential binary/relay log files /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog on ha-db02:3306. This may take long time...
        Applying log files succeeded.
        Wed Jul 27 17:23:05 2011 - [info]  All relay logs were successfully applied.
        Wed Jul 27 17:23:05 2011 - [info] Getting new master's binlog name and position..
        Wed Jul 27 17:23:05 2011 - [info]  mysql-bin.000006:107
        Wed Jul 27 17:23:05 2011 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='ha-db02 or 192.168.100.198', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000006', MASTER_LOG_POS=107, MASTER_USER='replication', MASTER_PASSWORD='xxx';
        Wed Jul 27 17:23:05 2011 - [warn] master_ip_failover_script is not set. Skipping taking over new master ip address.
        Wed Jul 27 17:23:05 2011 - [info] ** Finished master recovery successfully.
        Wed Jul 27 17:23:05 2011 - [info] * Phase 3: Master Recovery Phase completed.
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] * Phase 4: Slaves Recovery Phase..
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] -- Slave diff file generation on host ha-mgr01(192.168.100.200:3306) started, pid: 23533. Check tmp log /var/log/masterha/app1/ha-mgr02_3306_20110727172302.log if it takes time..
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] Log messages from ha-mgr01 ...
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
        Wed Jul 27 17:23:05 2011 - [info] End of log messages from ha-mgr01.
        Wed Jul 27 17:23:05 2011 - [info] -- ha-mgr01(192.168.100.200:3306) has the latest relay log events.
        Wed Jul 27 17:23:05 2011 - [info] Generating relay diff files from the latest slave succeeded.
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
        Wed Jul 27 17:23:05 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] -- Slave recovery on host ha-mgr01(192.168.100.200:3306) started, pid: 23535. Check tmp log /var/log/masterha/app1/ha-mgr02_3306_20110727172302.log if it takes time..
        Wed Jul 27 17:23:06 2011 - [info] 
        Wed Jul 27 17:23:06 2011 - [info] Log messages from ha-mgr01 ...
        Wed Jul 27 17:23:06 2011 - [info] 
        Wed Jul 27 17:23:05 2011 - [info] Sending binlog..
        Wed Jul 27 17:23:05 2011 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog to root@ha-mgr01:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog succeeded.
        Wed Jul 27 17:23:05 2011 - [info] Starting recovery on ha-mgr01(192.168.100.200:3306)..
        Wed Jul 27 17:23:05 2011 - [info]  Generating diffs succeeded.
        Wed Jul 27 17:23:05 2011 - [info] Waiting until all relay logs are applied.
        Wed Jul 27 17:23:05 2011 - [info]  done.
        Wed Jul 27 17:23:05 2011 - [info] Getting slave status..
        Wed Jul 27 17:23:05 2011 - [info] This slave(ha-mgr01)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000044:107). No need to recover from Exec_Master_Log_Pos.
        Wed Jul 27 17:23:05 2011 - [info] Connecting to the target slave host ha-mgr01, running recover script..
        Wed Jul 27 17:23:05 2011 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=ha-mgr01 --slave_ip=192.168.100.200  --slave_port=3306 --apply_files=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --timestamp=20110727172302 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50 --slave_pass=xxx
        Wed Jul 27 17:23:06 2011 - [info] 
        Applying differential binary/relay log files /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110727172302.binlog on ha-mgr01:3306. This may take long time...
        Applying log files succeeded.
        Wed Jul 27 17:23:06 2011 - [info]  All relay logs were successfully applied.
        Wed Jul 27 17:23:06 2011 - [info]  Resetting slave ha-mgr01(192.168.100.200:3306) and starting replication from the new master ha-db02(192.168.100.198:3306)..
        Wed Jul 27 17:23:06 2011 - [info]  Executed CHANGE MASTER.
        Wed Jul 27 17:23:06 2011 - [info]  Slave started.
        Wed Jul 27 17:23:06 2011 - [info] End of log messages from ha-mgr01.
        Wed Jul 27 17:23:06 2011 - [info] -- Slave recovery on host ha-mgr01(192.168.100.200:3306) succeeded.
        Wed Jul 27 17:23:06 2011 - [info] All new slave servers recovered successfully.
        Wed Jul 27 17:23:06 2011 - [info] 
        Wed Jul 27 17:23:06 2011 - [info] * Phase 5: New master cleanup phease..
        Wed Jul 27 17:23:06 2011 - [info] 
        Wed Jul 27 17:23:06 2011 - [info] Resetting slave info on the new master..
        Wed Jul 27 17:23:06 2011 - [info] Master failover to ha-db02(192.168.100.198:3306) completed successfully.
        Wed Jul 27 17:23:06 2011 - [info] 
        ----- Failover Report -----
        app1: MySQL Master failover ha-db01 to ha-db02 succeeded
        Master ha-db01 is down!
        Check MHA Manager logs at ha-mgr01.forschooner.net:/var/log/masterha/app1/app1.log for details.
        Started automated(non-interactive) failover.
        The latest slave ha-db02(192.168.100.198:3306) has all relay logs for recovery.
        Selected ha-db02 as a new master.
        ha-db02: OK: Applying all logs succeeded.
        ha-mgr01: This host has the latest relay log events.
        Generating relay diff files from the latest slave succeeded.
        ha-mgr01: OK: Applying all logs succeeded. Slave started, replicating from ha-db02.
        ha-db02: Resetting slave info succeeded.
        Master failover to ha-db02(192.168.100.198:3306) completed successfully.
        Wed Jul 27 17:23:06 2011 - [info] Sending mail..
        Unknown option: conf
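
The timestamps in the log above show how long the failover took. As a rough sketch (the function names are made up for illustration and are not part of MHA), the following Python measures the time from the first failed MySQL ping to the "completed successfully" line. It assumes every timestamped line uses the `Wed Jul 27 17:22:53 2011 - [level]` layout seen above and that the default English locale is in effect.

```python
from datetime import datetime

# Timestamp layout of the MHA Manager log lines,
# e.g. "Wed Jul 27 17:22:53 2011 - [warn] ..."
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_stamp(line):
    """Return the timestamp of a log line, or None if the line has none."""
    head, sep, _ = line.strip().partition(" - [")
    if not sep:
        return None
    try:
        return datetime.strptime(head, STAMP_FORMAT)
    except ValueError:
        return None


def failover_seconds(lines):
    """Seconds from the first failed MySQL ping to the last
    timestamped 'completed successfully' line."""
    start = end = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # untimestamped lines (script output, report) are skipped
        if start is None and "Got error on MySQL ping" in line:
            start = stamp
        if "completed successfully" in line:
            end = stamp
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Three lines copied from the log above.
sample = [
    "Wed Jul 27 17:22:53 2011 - [warn] Got error on MySQL ping: 2006 (MySQL server has gone away)",
    "Wed Jul 27 17:23:02 2011 - [info] Master is down!",
    "Wed Jul 27 17:23:06 2011 - [info] Master failover to ha-db02(192.168.100.198:3306) completed successfully.",
]
print(failover_seconds(sample))
```

For this run the gap between 17:22:53 and 17:23:06 is 13 seconds. To check a whole log, pass `open("/var/log/masterha/app1/app1.log")` instead of the sample list.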
        

          Wednesday, August 3, 2011

          MySQL MHA - switchover

          Now that MHA Node and MHA Manager are installed, the next step is to switch the master server manually.

          The roles of the master and slaves, including the MHA Manager and Nodes, before and after the switchover are shown below.

          ha-db01:   Master + MHA Node → MySQL stopped 
          ha-db02:   Slave + MHA Node → Master  
          ha-mgr01: Slave + MHA Manager → Slave 

          The switchover proceeds as follows.
          1. Stop MySQL on the master (ha-db01).
          2. Manually switch the master to the slave (ha-db02).
          3. The remaining slave (ha-mgr01) recognizes ha-db02 as its master.
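
For the last step, the remaining slave has to be pointed at the new master with a `CHANGE MASTER TO` statement. MHA issues this itself, so the Python sketch below only shows what the statement looks like. `change_master_statement` is a hypothetical helper, and the values are the ones printed in the earlier failover log. For a real switchover, the binlog file and position would come from the new master.

```python
def change_master_statement(host, log_file, log_pos, user, password, port=3306):
    """Build the CHANGE MASTER TO statement a slave runs to follow a new master.

    Illustration only: values are interpolated without escaping, so do not
    feed it untrusted input.
    """
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', MASTER_PORT={int(port)}, "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={int(log_pos)}, "
        f"MASTER_USER='{user}', MASTER_PASSWORD='{password}';"
    )


# Values taken from the earlier failover log; the password is masked there too.
statement = change_master_statement(
    host="ha-db02",
    log_file="mysql-bin.000006",
    log_pos=107,
    user="replication",
    password="xxx",
)
print(statement)
```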
          • Logging in to the MHA Manager (ha-mgr01)
          • Setting up the hosts file to connect via hostname
          # cat >> /etc/hosts <<EOF
          192.168.100.197 ha-db01
          192.168.100.198 ha-db02
          192.168.100.200 ha-mgr01
          EOF
          • Displaying the list of replication slaves registered with the master
          # mysql -uroot -pmysql -e 'SHOW SLAVE HOSTS\G' -h ha-db01
          *************************** 1. row ***************************
          Server_id: 300
               Host: 
               Port: 3306
          Master_id: 100
          *************************** 2. row ***************************
          Server_id: 200
               Host: 
               Port: 3306
          Master_id: 100
          
          • Checking the replication status of the slave servers
          # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h ha-db02
          *************************** 1. row ***************************
                         Slave_IO_State: Waiting for master to send event
                            Master_Host: 192.168.100.197
                            Master_User: replication
                            Master_Port: 3306
                          Connect_Retry: 60
                        Master_Log_File: mysql-bin.000045
                    Read_Master_Log_Pos: 107
                         Relay_Log_File: mysqld-relay-bin.00001
                          Relay_Log_Pos: 253
                  Relay_Master_Log_File: mysql-bin.000045
                       Slave_IO_Running: Yes
                      Slave_SQL_Running: Yes
                        Replicate_Do_DB: 
                    Replicate_Ignore_DB: 
                     Replicate_Do_Table: 
                 Replicate_Ignore_Table: 
                Replicate_Wild_Do_Table: 
            Replicate_Wild_Ignore_Table: 
                             Last_Errno: 0
                             Last_Error: 
                           Skip_Counter: 0
                    Exec_Master_Log_Pos: 107
                        Relay_Log_Space: 556
                        Until_Condition: None
                         Until_Log_File: 
                          Until_Log_Pos: 0
                     Master_SSL_Allowed: No
                     Master_SSL_CA_File: 
                     Master_SSL_CA_Path: 
                        Master_SSL_Cert: 
                      Master_SSL_Cipher: 
                         Master_SSL_Key: 
                  Seconds_Behind_Master: 0
          Master_SSL_Verify_Server_Cert: No
                          Last_IO_Errno: 0
                          Last_IO_Error: 
                         Last_SQL_Errno: 0
                         Last_SQL_Error: 
            Replicate_Ignore_Server_Ids: 
                       Master_Server_Id: 100 
          
          # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h localhost
          *************************** 1. row ***************************
                         Slave_IO_State: Waiting for master to send event
                            Master_Host: 192.168.100.197
                            Master_User: replication
                            Master_Port: 3306
                          Connect_Retry: 60
                        Master_Log_File: mysql-bin.000045
                    Read_Master_Log_Pos: 107
                         Relay_Log_File: ha-mgr01-relay-bin.000002
                          Relay_Log_Pos: 253
                  Relay_Master_Log_File: mysql-bin.000045
                       Slave_IO_Running: Yes
                      Slave_SQL_Running: Yes
                        Replicate_Do_DB: 
                    Replicate_Ignore_DB: 
                     Replicate_Do_Table: 
                 Replicate_Ignore_Table: 
                Replicate_Wild_Do_Table: 
            Replicate_Wild_Ignore_Table: 
                             Last_Errno: 0
                             Last_Error: 
                           Skip_Counter: 0
                    Exec_Master_Log_Pos: 107
                        Relay_Log_Space: 412
                        Until_Condition: None
                         Until_Log_File: 
                          Until_Log_Pos: 0
                     Master_SSL_Allowed: No
                     Master_SSL_CA_File: 
                     Master_SSL_CA_Path: 
                        Master_SSL_Cert: 
                      Master_SSL_Cipher: 
                         Master_SSL_Key: 
                  Seconds_Behind_Master: 0
          Master_SSL_Verify_Server_Cert: No
                          Last_IO_Errno: 0
                          Last_IO_Error: 
                         Last_SQL_Errno: 0
                         Last_SQL_Error: 
            Replicate_Ignore_Server_Ids: 
                       Master_Server_Id: 100 
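
Reading these long status dumps by eye is error-prone. As a minimal sketch (the helper names are hypothetical, not part of MySQL or MHA), the Python below parses the vertical output of `SHOW SLAVE STATUS` into a dictionary and checks the fields that matter most here: both replication threads running and no recorded error.

```python
def parse_vertical(output):
    """Parse vertical-format mysql client output into one dict per row."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("***") and "row" in line:
            rows.append({})  # a "*** N. row ***" separator starts a new row
        elif ":" in line and rows:
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


def replication_healthy(status):
    """True when both slave threads run and no error number is recorded."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and status.get("Last_Errno", "0") == "0"
    )


# A shortened copy of the output above.
sample = """
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.100.197
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
        Seconds_Behind_Master: 0
"""

rows = parse_vertical(sample)
print(rows[0]["Master_Host"], replication_healthy(rows[0]))
```

The text passed to `parse_vertical` could come from running the `mysql ... -e 'SHOW SLAVE STATUS\G'` command shown above and capturing its standard output.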
          
          • Setting up the Global configuration file
          # mkdir -p /var/log/masterha/app1
          # cat > /etc/masterha_default.cnf <<EOF
          [server default]
          user=root
          password=mysql
          ssh_user=root
          master_binlog_dir=/var/lib/mysql
          remote_workdir=/tmp/masterha
          secondary_check_script=/usr/local/bin/masterha_secondary_check -s ha-db02 --user=root --master_ip=192.168.100.197 --master_port=3306 --master_host=ha-db01 -s ha-db01
          ping_interval=3
          #master_ip_failover_script=/usr/local/bin/master_ip_failover
          #master_ip_online_change_script=/usr/local/bin/master_ip_online_change
          #shutdown_script=/usr/local/bin/power_manager
          report_script=/usr/local/bin/send_report
          EOF
          
          • Setting up the Application configuration file
          # cat > /etc/app1.cnf <<EOF
          [server default]
          user=root
          password=mysql
          manager_workdir=/var/log/masterha/app1
          manager_log=/var/log/masterha/app1/app1.log
          remote_workdir=/var/log/masterha/app1
          [server1]
          hostname=ha-db01
          [server2]
          hostname=ha-db02
          [server3]
          hostname=ha-mgr01
          EOF
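
Each server block in the application configuration file should have its own section name ([server1], [server2], [server3], ...). A repeated name is easy to miss and may cause one host to be dropped or overridden. The Python sketch below uses the standard `configparser` module to catch that before the file is handed to MHA. Note that this is Python's INI parser, not the one MHA uses, so it only approximates how MHA reads the file. `load_mha_config` and `server_hostnames` are hypothetical helper names.

```python
import configparser


def load_mha_config(text):
    """Parse MHA-style INI text.

    With strict=True, configparser raises DuplicateSectionError
    when a section name appears twice.
    """
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    parser.read_string(text)
    return parser


def server_hostnames(parser):
    """Hostnames of every [serverN] block, skipping [server default]."""
    return [
        parser[name]["hostname"]
        for name in parser.sections()
        if name.startswith("server") and name != "server default"
    ]


sample = """
[server default]
user=root
manager_workdir=/var/log/masterha/app1
[server1]
hostname=ha-db01
[server2]
hostname=ha-db02
[server3]
hostname=ha-mgr01
"""

print(server_hostnames(load_mha_config(sample)))
```

Feeding it a file with two [server2] blocks raises `configparser.DuplicateSectionError` instead of returning a shortened host list.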
          
          • Checking SSH connectivity between the MHA Manager and Nodes
          # masterha_check_ssh --conf=/etc/app1.cnf 
          Thu Jul 28 10:32:27 2011 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
          Thu Jul 28 10:32:27 2011 - [info] Reading application default configurations from /etc/app1.cnf..
          Thu Jul 28 10:32:27 2011 - [info] Reading server configurations from /etc/app1.cnf..
          Thu Jul 28 10:32:27 2011 - [info] Starting SSH connection tests..
          Thu Jul 28 10:32:28 2011 - [debug] 
          Thu Jul 28 10:32:27 2011 - [debug]  Connecting via SSH from root@ha-db01(192.168.100.197) to root@ha-db02(192.168.100.198)..
          Thu Jul 28 10:32:28 2011 - [debug]   ok.
          Thu Jul 28 10:32:28 2011 - [debug]  Connecting via SSH from root@ha-db01(192.168.100.197) to root@ha-mgr02(192.168.100.200)..
          Thu Jul 28 10:32:28 2011 - [debug]   ok.
          Thu Jul 28 10:32:29 2011 - [debug] 
          Thu Jul 28 10:32:28 2011 - [debug]  Connecting via SSH from root@ha-db02(192.168.100.198) to root@ha-db01(192.168.100.197)..
          Thu Jul 28 10:32:28 2011 - [debug]   ok.
          Thu Jul 28 10:32:28 2011 - [debug]  Connecting via SSH from root@ha-db02(192.168.100.198) to root@ha-mgr02(192.168.100.200)..
          Thu Jul 28 10:32:28 2011 - [debug]   ok.
          Thu Jul 28 10:32:29 2011 - [debug] 
          Thu Jul 28 10:32:28 2011 - [debug]  Connecting via SSH from root@ha-mgr02(192.168.100.200) to root@ha-db01(192.168.100.197)..
          Thu Jul 28 10:32:29 2011 - [debug]   ok.
          Thu Jul 28 10:32:29 2011 - [debug]  Connecting via SSH from root@ha-mgr02(192.168.100.200) to root@ha-db02(192.168.100.198)..
          Thu Jul 28 10:32:29 2011 - [debug]   ok.
          Thu Jul 28 10:32:29 2011 - [info] All SSH connection tests passed successfully.
          
          • Checking if MySQL replication is enabled
          # masterha_check_repl --conf=/etc/app1.cnf 
          Thu Jul 28 10:32:39 2011 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
          Thu Jul 28 10:32:39 2011 - [info] Reading application default configurations from /etc/app1.cnf..
          Thu Jul 28 10:32:39 2011 - [info] Reading server configurations from /etc/app1.cnf..
          Thu Jul 28 10:32:39 2011 - [info] MHA::MasterMonitor version 0.50.
          Thu Jul 28 10:32:39 2011 - [info] Dead Servers:
          Thu Jul 28 10:32:39 2011 - [info] Alive Servers:
          Thu Jul 28 10:32:39 2011 - [info]   ha-db01(192.168.100.197:3306)
          Thu Jul 28 10:32:39 2011 - [info]   ha-db02(192.168.100.198:3306)
          Thu Jul 28 10:32:39 2011 - [info]   ha-mgr02(192.168.100.200:3306)
          Thu Jul 28 10:32:39 2011 - [info] Alive Slaves:
          Thu Jul 28 10:32:39 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:32:39 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:32:39 2011 - [info]   ha-mgr02(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:32:39 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:32:39 2011 - [info] Current Master: ha-db01(192.168.100.197:3306)
          Thu Jul 28 10:32:39 2011 - [info] Checking slave configurations..
          Thu Jul 28 10:32:39 2011 - [warn]  read_only=1 is not set on slave ha-db02(192.168.100.198:3306).
          Thu Jul 28 10:32:39 2011 - [warn]  relay_log_purge=0 is not set on slave ha-db02(192.168.100.198:3306).
          Thu Jul 28 10:32:39 2011 - [warn]  read_only=1 is not set on slave ha-mgr02(192.168.100.200:3306).
          Thu Jul 28 10:32:39 2011 - [warn]  relay_log_purge=0 is not set on slave ha-mgr02(192.168.100.200:3306).
          Thu Jul 28 10:32:39 2011 - [info] Checking replication filtering settings..
          Thu Jul 28 10:32:39 2011 - [info]  binlog_do_db= , binlog_ignore_db= 
          Thu Jul 28 10:32:39 2011 - [info]  Replication filtering check ok.
          Thu Jul 28 10:32:39 2011 - [info] Starting SSH connection tests..
          Thu Jul 28 10:32:41 2011 - [info] All SSH connection tests passed successfully.
          Thu Jul 28 10:32:41 2011 - [info] Checking MHA Node version..
          Thu Jul 28 10:32:41 2011 - [info]  Version check ok.
          Thu Jul 28 10:32:41 2011 - [info] Checking SSH publickey authentication and checking recovery script configurations on the current master..
          Thu Jul 28 10:32:42 2011 - [info]   Executing command: save_binary_logs --command=test --start_file=mysql-bin.000045 --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.50 
          Thu Jul 28 10:32:42 2011 - [info]   Connecting to root@ha-db01(ha-db01).. 
            Creating /var/log/masterha/app1 if not exists..    ok.
            Checking output directory is accessible or not..
             ok.
            Binlog found at /var/lib/mysql, up to mysql-bin.000045
          Thu Jul 28 10:32:42 2011 - [info] Master setting check done.
          Thu Jul 28 10:32:42 2011 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
          Thu Jul 28 10:32:42 2011 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=ha-db02 --slave_ip=192.168.100.198 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --manager_version=0.50 --relay_log_info=/var/lib/mysql/relay-log.info  --slave_pass=xxx
          Thu Jul 28 10:32:42 2011 - [info]   Connecting to root@192.168.100.198(ha-db02).. 
            Checking slave recovery environment settings..
              Opening /var/lib/mysql/relay-log.info ... ok.
              Relay log found at /var/lib/mysql, up to mysqld-relay-bin.000017
              Temporary relay log file is /var/lib/mysql/mysqld-relay-bin.000017
              Testing mysql connection and privileges.. done.
              Testing mysqlbinlog output.. done.
              Cleaning up test file(s).. done.
          Thu Jul 28 10:32:42 2011 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=ha-mgr02 --slave_ip=192.168.100.200 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --manager_version=0.50 --relay_log_info=/var/lib/mysql/relay-log.info  --slave_pass=xxx
          Thu Jul 28 10:32:42 2011 - [info]   Connecting to root@192.168.100.200(ha-mgr02).. 
            Checking slave recovery environment settings..
              Opening /var/lib/mysql/relay-log.info ... ok.
              Relay log found at /var/lib/mysql, up to ha-mgr02-relay-bin.000002
              Temporary relay log file is /var/lib/mysql/ha-mgr02-relay-bin.000002
              Testing mysql connection and privileges.. done.
              Testing mysqlbinlog output.. done.
              Cleaning up test file(s).. done.
          Thu Jul 28 10:32:42 2011 - [info] Slaves settings check done.
          Thu Jul 28 10:32:42 2011 - [info] 
          ha-db01 (current master)
           +--ha-db02
           +--ha-mgr02
          Thu Jul 28 10:32:42 2011 - [info] Checking replication health on ha-db02..
          Thu Jul 28 10:32:42 2011 - [info]  ok.
          Thu Jul 28 10:32:42 2011 - [info] Checking replication health on ha-mgr02..
          Thu Jul 28 10:32:42 2011 - [info]  ok.
          Thu Jul 28 10:32:42 2011 - [warn] master_ip_failover_script is not defined.
          Thu Jul 28 10:32:42 2011 - [warn] shutdown_script is not defined.
          Thu Jul 28 10:32:42 2011 - [info] Got exit code 0 (Not master dead).
          MySQL Replication Health is OK. 
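The [warn] lines in the output above name settings that are not set on the slaves (read_only=1 and relay_log_purge=0). As a rough sketch, the filter below pulls those warnings out of the masterha_check_repl output, one "host setting" pair per line. The function name list_unset_slave_settings is made up for this post and is not part of MHA. It assumes the warning wording is exactly as shown in the log above.

```shell
# Hypothetical helper (not part of MHA): read masterha_check_repl output on
# stdin and print "<host> <variable>=<value>" for each warning of the form
# "<variable>=<value> is not set on slave <host>(<ip>:<port>)."
# Possible usage:
#   masterha_check_repl --conf=/etc/app1.cnf 2>&1 | list_unset_slave_settings
list_unset_slave_settings() {
    sed -n 's/^.*\[warn][[:space:]]*\([a-z_]*=[0-9]*\) is not set on slave \([^(]*\)(.*$/\2 \1/p'
}
```

Each reported setting could then be applied on the named slave with a statement such as SET GLOBAL read_only=1 or SET GLOBAL relay_log_purge=0. Keep in mind that SET GLOBAL does not survive a server restart, so the same values would also need to go into my.cnf. Whether read_only=1 suits a given slave depends on how that server is used.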
          
          • Starting up the MHA Manager
          # masterha_manager --conf=/etc/app1.cnf 
          • Checking the status 
          # masterha_check_status --conf=/etc/app1.cnf 
          • Stopping the MySQL daemon on the master (ha-db01)
          # ssh ha-db01 '/etc/init.d/mysql stop' 
          • Manually switching the master server from ha-db01 to ha-db02
          # masterha_master_switch --master_state=dead --conf=/etc/app1.cnf --dead_master_host=ha-db01 --new_master_host=ha-db02
          --dead_master_ip=<dead_master_ip> is not set. Using 192.168.100.197.
          --dead_master_port=<dead_master_port> is not set. Using 3306.
          Thu Jul 28 10:39:09 2011 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
          Thu Jul 28 10:39:09 2011 - [info] Reading application default configurations from /etc/app1.cnf..
          Thu Jul 28 10:39:09 2011 - [info] Reading server configurations from /etc/app1.cnf..
          Thu Jul 28 10:39:09 2011 - [info] MHA::MasterFailover version 0.50.
          Thu Jul 28 10:39:09 2011 - [info] Starting master failover.
          Thu Jul 28 10:39:09 2011 - [info] 
          Thu Jul 28 10:39:09 2011 - [info] * Phase 1: Configuration Check Phase..
          Thu Jul 28 10:39:09 2011 - [info] 
          Thu Jul 28 10:39:09 2011 - [info] Dead Servers:
          Thu Jul 28 10:39:09 2011 - [info]   ha-db01(192.168.100.197:3306)
          Thu Jul 28 10:39:09 2011 - [info] Checking master reachability via mysql(double check)..
          Thu Jul 28 10:39:09 2011 - [info]  ok.
          Thu Jul 28 10:39:09 2011 - [info] Alive Servers:
          Thu Jul 28 10:39:09 2011 - [info]   ha-db02(192.168.100.198:3306)
          Thu Jul 28 10:39:09 2011 - [info]   ha-mgr01(192.168.100.200:3306)
          Thu Jul 28 10:39:09 2011 - [info] Alive Slaves:
          Thu Jul 28 10:39:09 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:09 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:39:09 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:09 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Master ha-db01 is dead. Proceed? (yes/NO): yes
          Thu Jul 28 10:39:19 2011 - [info] ** Phase 1: Configuration Check Phase completed.
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] * Phase 2: Dead Master Shutdown Phase..
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] HealthCheck: SSH to ha-db01 is reachable.
          Thu Jul 28 10:39:19 2011 - [info] Forcing shutdown so that applications never connect to the current master..
          Thu Jul 28 10:39:19 2011 - [warn] master_ip_failover_script is not set. Skipping invalidating dead master ip address.
          Thu Jul 28 10:39:19 2011 - [warn] shutdown_script is not set. Skipping explicit shutting down of the dead master.
          Thu Jul 28 10:39:19 2011 - [info] * Phase 2: Dead Master Shutdown Phase completed.
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] * Phase 3: Master Recovery Phase..
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] * Phase 3.1: Getting Latest Slaves Phase..
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] The latest binary log file/position on all slaves is mysql-bin.000045:107
          Thu Jul 28 10:39:19 2011 - [info] Latest slaves (Slaves that received relay log files to the latest):
          Thu Jul 28 10:39:19 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:19 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:39:19 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:19 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:39:19 2011 - [info] The oldest binary log file/position on all slaves is mysql-bin.000045:107
          Thu Jul 28 10:39:19 2011 - [info] Oldest slaves:
          Thu Jul 28 10:39:19 2011 - [info]   ha-db02(192.168.100.198:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:19 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:39:19 2011 - [info]   ha-mgr01(192.168.100.200:3306)  Version=5.5.12-log (oldest major version between slaves) log-bin:enabled
          Thu Jul 28 10:39:19 2011 - [info]     Replicating from 192.168.100.197(192.168.100.197:3306)
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
          Thu Jul 28 10:39:19 2011 - [info] 
          Thu Jul 28 10:39:19 2011 - [info] Fetching dead master's binary logs..
          Thu Jul 28 10:39:19 2011 - [info] Executing command on the dead master ha-db01(192.168.100.197:3306): save_binary_logs --command=save --start_file=mysql-bin.000045  --start_pos=107 --binlog_dir=/var/lib/mysql --output_file=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50
            Creating /var/log/masterha/app1 if not exists..    ok.
           Concat binary/relay logs from mysql-bin.000045 pos 107 to mysql-bin.000045 EOF into /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog ..
            Dumping binlog format description event, from position 0 to 107.. ok.
            Dumping effective binlog data from /var/lib/mysql/mysql-bin.000045 position 107 to tail(126).. ok.
           Concat succeeded.
          saved_master_binlog_from_ha-db01_3306_20110728103909.binlog                                    100%  126     0.1KB/s   00:00    
          Thu Jul 28 10:39:20 2011 - [info] scp from root@192.168.100.197:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog to local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog succeeded.
          Thu Jul 28 10:39:20 2011 - [info] HealthCheck: SSH to ha-db02 is reachable.
          Thu Jul 28 10:39:21 2011 - [info] HealthCheck: SSH to ha-mgr01 is reachable.
          Thu Jul 28 10:39:21 2011 - [info] 
          Thu Jul 28 10:39:21 2011 - [info] * Phase 3.3: Determining New Master Phase..
          Thu Jul 28 10:39:21 2011 - [info] 
          Thu Jul 28 10:39:21 2011 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
          Thu Jul 28 10:39:21 2011 - [info] All slaves received relay logs to the same position. No need to resync each other.
          Thu Jul 28 10:39:21 2011 - [info] ha-db02 can be new master.
          Thu Jul 28 10:39:21 2011 - [info] New master is ha-db02(192.168.100.198:3306)
          Thu Jul 28 10:39:21 2011 - [info] Starting master failover..
          Thu Jul 28 10:39:21 2011 - [info] 
          From:
          ha-db01 (current master)
           +--ha-db02
           +--ha-mgr01
          To:
          ha-db02 (new master)
           +--ha-mgr01
          Starting master switch from ha-db01(192.168.100.197:3306) to ha-db02(192.168.100.198:3306)? (yes/NO): yes
          Thu Jul 28 10:39:29 2011 - [info] New master decided manually is ha-db02(192.168.100.198:3306)
          Thu Jul 28 10:39:29 2011 - [info] 
          Thu Jul 28 10:39:29 2011 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
          Thu Jul 28 10:39:29 2011 - [info] 
          Thu Jul 28 10:39:29 2011 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
          Thu Jul 28 10:39:29 2011 - [info] Sending binlog..
          saved_master_binlog_from_ha-db01_3306_20110728103909.binlog                                    100%  126     0.1KB/s   00:00    
          Thu Jul 28 10:39:30 2011 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog to root@ha-db02:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog succeeded.
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] * Phase 3.4: Master Log Apply Phase..
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
          Thu Jul 28 10:39:30 2011 - [info] Starting recovery on ha-db02(192.168.100.198:3306)..
          Thu Jul 28 10:39:30 2011 - [info]  Generating diffs succeeded.
          Thu Jul 28 10:39:30 2011 - [info] Waiting until all relay logs are applied.
          Thu Jul 28 10:39:30 2011 - [info]  done.
          Thu Jul 28 10:39:30 2011 - [info] Getting slave status..
          Thu Jul 28 10:39:30 2011 - [info] This slave(ha-db02)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000045:107). No need to recover from Exec_Master_Log_Pos.
          Thu Jul 28 10:39:30 2011 - [info] Connecting to the target slave host ha-db02, running recover script..
          Thu Jul 28 10:39:30 2011 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=ha-db02 --slave_ip=192.168.100.198  --slave_port=3306 --apply_files=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --timestamp=20110728103909 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50 --slave_pass=xxx
          Thu Jul 28 10:39:30 2011 - [info] 
          Applying differential binary/relay log files /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog on ha-db02:3306. This may take long time...
          Applying log files succeeded.
          Thu Jul 28 10:39:30 2011 - [info]  All relay logs were successfully applied.
          Thu Jul 28 10:39:30 2011 - [info] Getting new master's binlog name and position..
          Thu Jul 28 10:39:30 2011 - [info]  mysql-bin.000006:107
          Thu Jul 28 10:39:30 2011 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='ha-db02 or 192.168.100.198', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000006', MASTER_LOG_POS=107, MASTER_USER='replication', MASTER_PASSWORD='xxx';
          Thu Jul 28 10:39:30 2011 - [warn] master_ip_failover_script is not set. Skipping taking over new master ip address.
          Thu Jul 28 10:39:30 2011 - [info] ** Finished master recovery successfully.
          Thu Jul 28 10:39:30 2011 - [info] * Phase 3: Master Recovery Phase completed.
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] * Phase 4: Slaves Recovery Phase..
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] -- Slave diff file generation on host ha-mgr01(192.168.100.200:3306) started, pid: 18983. Check tmp log /var/log/masterha/app1/ha-mgr02_3306_20110728103909.log if it takes time..
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] Log messages from ha-mgr01 ...
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
          Thu Jul 28 10:39:30 2011 - [info] End of log messages from ha-mgr01.
          Thu Jul 28 10:39:30 2011 - [info] -- ha-mgr01(192.168.100.200:3306) has the latest relay log events.
          Thu Jul 28 10:39:30 2011 - [info] Generating relay diff files from the latest slave succeeded.
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
          Thu Jul 28 10:39:30 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] -- Slave recovery on host ha-mgr01(192.168.100.200:3306) started, pid: 18985. Check tmp log /var/log/masterha/app1/ha-mgr02_3306_20110728103909.log if it takes time..
          saved_master_binlog_from_ha-db01_3306_20110728103909.binlog                                    100%  126     0.1KB/s   00:00    
          Thu Jul 28 10:39:31 2011 - [info] 
          Thu Jul 28 10:39:31 2011 - [info] Log messages from ha-mgr01 ...
          Thu Jul 28 10:39:31 2011 - [info] 
          Thu Jul 28 10:39:30 2011 - [info] Sending binlog..
          Thu Jul 28 10:39:31 2011 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog to root@ha-mgr01:/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog succeeded.
          Thu Jul 28 10:39:31 2011 - [info] Starting recovery on ha-mgr01(192.168.100.200:3306)..
          Thu Jul 28 10:39:31 2011 - [info]  Generating diffs succeeded.
          Thu Jul 28 10:39:31 2011 - [info] Waiting until all relay logs are applied.
          Thu Jul 28 10:39:31 2011 - [info]  done.
          Thu Jul 28 10:39:31 2011 - [info] Getting slave status..
          Thu Jul 28 10:39:31 2011 - [info] This slave(ha-mgr01)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000045:107). No need to recover from Exec_Master_Log_Pos.
          Thu Jul 28 10:39:31 2011 - [info] Connecting to the target slave host ha-mgr01, running recover script..
          Thu Jul 28 10:39:31 2011 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=ha-mgr01 --slave_ip=192.168.100.200  --slave_port=3306 --apply_files=/var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog --workdir=/var/log/masterha/app1 --target_version=5.5.12-log --timestamp=20110728103909 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.50 --slave_pass=xxx
          Thu Jul 28 10:39:31 2011 - [info] 
          Applying differential binary/relay log files /var/log/masterha/app1/saved_master_binlog_from_ha-db01_3306_20110728103909.binlog on ha-mgr01:3306. This may take long time...
          Applying log files succeeded.
          Thu Jul 28 10:39:31 2011 - [info]  All relay logs were successfully applied.
          Thu Jul 28 10:39:31 2011 - [info]  Resetting slave ha-mgr01(192.168.100.200:3306) and starting replication from the new master ha-db02(192.168.100.198:3306)..
          Thu Jul 28 10:39:31 2011 - [info]  Executed CHANGE MASTER.
          Thu Jul 28 10:39:31 2011 - [info]  Slave started.
          Thu Jul 28 10:39:31 2011 - [info] End of log messages from ha-mgr01.
          Thu Jul 28 10:39:31 2011 - [info] -- Slave recovery on host ha-mgr01(192.168.100.200:3306) succeeded.
          Thu Jul 28 10:39:31 2011 - [info] All new slave servers recovered successfully.
          Thu Jul 28 10:39:31 2011 - [info] 
          Thu Jul 28 10:39:31 2011 - [info] * Phase 5: New master cleanup phease..
          Thu Jul 28 10:39:31 2011 - [info] 
          Thu Jul 28 10:39:31 2011 - [info] Resetting slave info on the new master..
          Thu Jul 28 10:39:31 2011 - [info] Master failover to ha-db02(192.168.100.198:3306) completed successfully.
          Thu Jul 28 10:39:31 2011 - [info] 
          ----- Failover Report -----
          app1: MySQL Master failover ha-db01 to ha-db02 succeeded
          Master ha-db01 is down!
          
          Check MHA Manager logs at ha-mgr01 for details.
          
          Started manual(interactive) failover.
          The latest slave ha-db02(192.168.100.198:3306) has all relay logs for recovery.
          Selected ha-db02 as a new master.
          ha-db02: OK: Applying all logs succeeded.
          ha-mgr01: This host has the latest relay log events.
          Generating relay diff files from the latest slave succeeded.
          ha-mgr01: OK: Applying all logs succeeded. Slave started, replicating from ha-db02.
          ha-db02: Resetting slave info succeeded.
          Master failover to ha-db02(192.168.100.198:3306) completed successfully.
          Thu Jul 28 10:39:31 2011 - [info] Sending mail..
          Unknown option: conf
          
          • Displaying the list of replication slaves registered with the new master
          # mysql -uroot -pmysql -e 'SHOW SLAVE HOSTS\G' -h ha-db02
          *************************** 1. row ***************************
          Server_id: 300
               Host: 
               Port: 3306
          Master_id: 200 
          
          • Checking the replication status of the slave servers
          # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h localhost
          *************************** 1. row ***************************
                         Slave_IO_State: Waiting for master to send event
                            Master_Host: 192.168.100.198
                            Master_User: replication
                            Master_Port: 3306
                          Connect_Retry: 60
                        Master_Log_File: mysql-bin.000006
                    Read_Master_Log_Pos: 107
                         Relay_Log_File: ha-mgr01-relay-bin.000002
                          Relay_Log_Pos: 253
                  Relay_Master_Log_File: mysql-bin.000006
                       Slave_IO_Running: Yes
                      Slave_SQL_Running: Yes
                        Replicate_Do_DB: 
                    Replicate_Ignore_DB: 
                     Replicate_Do_Table: 
                 Replicate_Ignore_Table: 
                Replicate_Wild_Do_Table: 
            Replicate_Wild_Ignore_Table: 
                             Last_Errno: 0
                             Last_Error: 
                           Skip_Counter: 0
                    Exec_Master_Log_Pos: 107
                        Relay_Log_Space: 412
                        Until_Condition: None
                         Until_Log_File: 
                          Until_Log_Pos: 0
                     Master_SSL_Allowed: No
                     Master_SSL_CA_File: 
                     Master_SSL_CA_Path: 
                        Master_SSL_Cert: 
                      Master_SSL_Cipher: 
                         Master_SSL_Key: 
                  Seconds_Behind_Master: 0
          Master_SSL_Verify_Server_Cert: No
                          Last_IO_Errno: 0
                          Last_IO_Error: 
                         Last_SQL_Errno: 0
                         Last_SQL_Error: 
            Replicate_Ignore_Server_Ids: 
                       Master_Server_Id: 200
          
          # mysql -uroot -pmysql -e 'SHOW SLAVE STATUS\G' -h ha-db02
          *************************** 1. row ***************************
                         Slave_IO_State: 
                            Master_Host: 192.168.100.197
                            Master_User: replication
                            Master_Port: 3306
                          Connect_Retry: 60
                        Master_Log_File: 
                    Read_Master_Log_Pos: 4
                         Relay_Log_File: mysqld-relay-bin.000001
                          Relay_Log_Pos: 4
                  Relay_Master_Log_File: 
                       Slave_IO_Running: No
                      Slave_SQL_Running: No
                        Replicate_Do_DB: 
                    Replicate_Ignore_DB: 
                     Replicate_Do_Table: 
                 Replicate_Ignore_Table: 
                Replicate_Wild_Do_Table: 
            Replicate_Wild_Ignore_Table: 
                             Last_Errno: 0
                             Last_Error: 
                           Skip_Counter: 0
                    Exec_Master_Log_Pos: 0
                        Relay_Log_Space: 126
                        Until_Condition: None
                         Until_Log_File: 
                          Until_Log_Pos: 0
                     Master_SSL_Allowed: No
                     Master_SSL_CA_File: 
                     Master_SSL_CA_Path: 
                        Master_SSL_Cert: 
                      Master_SSL_Cipher: 
                         Master_SSL_Key: 
                  Seconds_Behind_Master: NULL
          Master_SSL_Verify_Server_Cert: No
                          Last_IO_Errno: 0
                          Last_IO_Error: 
                         Last_SQL_Errno: 0
                         Last_SQL_Error: 
            Replicate_Ignore_Server_Ids: 
                       Master_Server_Id: 100
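In the two outputs above, the remaining slave reports Yes for both Slave_IO_Running and Slave_SQL_Running, while ha-db02, now the master, reports No for both because its slave info was reset during failover. Reading those two fields by eye gets tedious, so here is a minimal sketch of a check that can be scripted. The function name slave_threads_running is made up for this post. It assumes the vertical (\G) output format shown above.

```shell
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and return
# success (exit status 0) only if both replication threads report "Yes".
# Possible usage:
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | slave_threads_running \
#       && echo "replication threads are running"
slave_threads_running() {
    awk -F': *' '
        # Strip trailing whitespace from the value before comparing it.
        { v = $2; sub(/[ \t\r]+$/, "", v) }
        $1 ~ /Slave_IO_Running$/  { io  = v }
        $1 ~ /Slave_SQL_Running$/ { sql = v }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

Empty input, which is what a server that is not configured as a slave returns, also counts as "not running" here.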
          

          Lastly, let's see how the master server fails over automatically.