If a hardware fault or other calamity takes a node out of an HP-UX Serviceguard cluster, you lose the ability to make incremental changes to the cluster until that node comes back.
If you need to make a change to a cluster in this state and you don’t want to bring down the cluster, you have to do all your changes with one gigantic command line.
Let’s say you have a 4-node cluster with nodes named cnode1, cnode2, cnode3, and cnode4.
cnode4 suffers a hardware fault and your packages fail over to cnode1-3. But your usage has grown, one package is beating the hardware down, and you want to move it from cnode2 to cnode1.
Well, you can’t do it incrementally. You have to do it all at once. I recently ran into a situation where I had to modify 37 cluster environment files and the cluster configuration to remove node cnode4.
That requires you to correctly type a command line that could easily be in excess of 4000 characters. Anybody who knows my typing skills knows this is beyond my abilities on my best day.
So I wrote a little assistant program.
It consists of three files, two of which are scripts.
pkg-mod-list is a list of all the package configuration files, with full paths, that need to be modified. It is your choice how to handle the editing. We used Ansible last night when we did it in a DR cluster.
Contents …
/etc/cmcluster/nc-package-name/nc-package-name.env
/etc/cmcluster/sc-package-name/sc-package-name.env
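Since a single bad path in the list would spoil the whole command line, it may be worth checking the list first. This is only a sketch of how that could look; check_pkg_list is a name I made up for illustration, not part of Serviceguard, and it assumes one path per line.

```shell
# check_pkg_list LIST (hypothetical helper, not a Serviceguard command):
# report every entry in LIST that is not a readable file.
# Returns 0 if all entries are usable, 1 otherwise.
check_pkg_list() {
    list=$1
    bad=0
    while read -r pfile
    do
        # Skip blank lines so stray empty lines are not reported as errors.
        [ -z "$pfile" ] && continue
        if [ ! -r "$pfile" ]; then
            echo "missing or unreadable: $pfile" >&2
            bad=1
        fi
    done < "$list"
    return "$bad"
}
```

You would run it as check_pkg_list pkg-mod-list and only go on to build the command line if it returns 0.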
Then we have helper scripts which put the command line together.
myclusterV6_Prod.conf is the main cluster configuration file with the references to node cnode4 commented out.
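With this many files, a leftover live reference to the removed node is easy to miss. The following is a rough sketch of a check for that, assuming that a line whose first non-blank character is "#" is a comment. find_node_refs is a hypothetical name, and the match is a plain substring match, so cnode4 would also match a name like cnode40.

```shell
# find_node_refs NODE FILE... (hypothetical helper):
# print every line that still mentions NODE and is not commented out.
find_node_refs() {
    node=$1
    shift
    for f in "$@"
    do
        # Drop comment lines first, then look for the node name as a fixed string.
        grep -v '^[[:space:]]*#' "$f" | grep -F -- "$node" |
        while IFS= read -r line
        do
            printf '%s: %s\n' "$f" "$line"
        done
    done
}
```

Something like find_node_refs cnode4 /etc/cmcluster/configs/myclusterV6_Prod.conf should print nothing if every reference really is commented out.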
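cat missing-node-checkconf
# Build one cmcheckconf command line with a -P option for every package file.
MAIN="cmcheckconf -C /etc/cmcluster/configs/myclusterV6_Prod.conf"
PCMD=""
# Redirect the list into the loop instead of piping from cat, so PCMD is
# still set after the loop in shells that run pipeline stages in a subshell.
while read -r pfile
do
PCMD="${PCMD} -P ${pfile}"
### echo "$PCMD"
done < pkg-mod-list
MYCMD="${MAIN} ${PCMD}"
echo "$MYCMD"
# Left unquoted on purpose so the string splits into separate arguments.
exec ${MYCMD}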
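The second script is identical except that it runs cmapplyconf, which applies the change once the check comes back clean.
# Build one cmapplyconf command line with a -P option for every package file.
MAIN="cmapplyconf -C /etc/cmcluster/configs/myclusterV6_Prod.conf"
PCMD=""
# Redirect the list into the loop instead of piping from cat, so PCMD is
# still set after the loop in shells that run pipeline stages in a subshell.
while read -r pfile
do
PCMD="${PCMD} -P ${pfile}"
### echo "$PCMD"
done < pkg-mod-list
MYCMD="${MAIN} ${PCMD}"
echo "$MYCMD"
# Left unquoted on purpose so the string splits into separate arguments.
exec ${MYCMD}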
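The two scripts differ only in the command name, so they could be folded into one function. This is a sketch of that idea rather than what we actually ran; build_cmd and its argument order are my own invention, and it only prints the command line so you can read it over before executing anything.

```shell
# build_cmd VERB CONF LIST (hypothetical helper):
# print the full command line for VERB (cmcheckconf or cmapplyconf)
# with one -P option for every non-blank line of LIST.
build_cmd() {
    verb=$1
    conf=$2
    list=$3
    cmd="$verb -C $conf"
    while read -r pfile
    do
        # Skip blank lines so they do not produce an empty -P option.
        [ -z "$pfile" ] && continue
        cmd="$cmd -P $pfile"
    done < "$list"
    echo "$cmd"
}

# Example use (commented out so that sourcing this file runs nothing):
# MYCMD=$(build_cmd cmcheckconf /etc/cmcluster/configs/myclusterV6_Prod.conf pkg-mod-list)
# echo "${#MYCMD} characters"   # shows how long the command line really is
# ${MYCMD}                      # unquoted on purpose, so it splits into arguments
```

Printing the length first is a handy sanity check when the command line runs to thousands of characters.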