Computational Chemistry Helpers

October 14, 2011

For the past couple of months, I have been in the process of becoming our group’s computational guy. It has been quite fun (computers and chemistry!), but the workflow has been…interesting.

For example, our group does a lot of radical work, which means spin-density calculations are a common task for me. With Gaussian, this requires the cubegen utility, which takes a number of non-intuitive arguments that don’t change between runs. For a while, I was grepping the command out of my shell history and editing the file name it was to work on. But I found this annoying, which spawned the following script on a slow day. And because it was a slow day, I went all out.

#!/bin/bash
# just a shortcut to avoid copying and pasting this stupid command over and over and over again

filename="$1"

if [ "${filename#*.}" != "fchk" ] ; then
    echo "File not a formatted checkpoint file" 1>&2
    if [ "${filename#*.}" = "chk" ] ; then
        echo "File is a binary checkpoint file. Converting" 1>&2
        formchk "$filename" "${filename%.chk}.fchk"
        filename="${filename%.chk}.fchk"
    else
        exit 1
    fi
fi

if [ "${filename#*.}" = "fchk" ] ; then
    cubegen 0 Spin "$filename" "${filename%.fchk}-spin.cube" 0 h
fi

At this point, I was also hammering our “supercomputer,” eating up all the file descriptors and causing some other group members’ calculations to fail… So, instead of just trying to hold myself back and remember which input files I was planning to run, I threw together this quick script to run a bunch of jobs one by one (originally, it was just a close clone of the one the Gaussian manual lists).


#!/bin/bash
logfile="gaussrun-$(date +%m-%d-%y).log"

echo -e "\nStarting run on $(date):" | tee -a "$logfile" 1>&2
echo " Running Jobs:" | tee -a "$logfile" 1>&2
for file in "$@" ; do
    if [ -e "$file" -a "${file#*.}" = "com" ] ; then
        echo "  $file" | tee -a "$logfile" 1>&2
    else
        echo "   Error: $file does not exist or is not a Gaussian input file" | tee -a "$logfile" 1>&2
        exit 1
    fi
done
echo -e "\n" | tee -a "$logfile" 1>&2
for file in "$@" ; do
    echo -e "    -------\n    Starting file $file at $(date)" | tee -a "$logfile" 1>&2
    g09 "$file" "${file%.com}.log"
    echo -e "    $file Done with status $?" | tee -a "$logfile" 1>&2
done
echo -e "All Done.\n\n" | tee -a "$logfile" 1>&2

In all honesty, though, I am not happy with this one. It’s too static for my tastes: once it starts, I can’t queue up more jobs. I was playing with named pipes and also with at/batch, but there wasn’t an easy or obvious way for me to add jobs and also inspect, be told about, or edit what was in the queue. The qsub command would do just what I want it to do, but that requires TORQUE or some other package that isn’t in the repos, and would mean asking my advisor to install it manually. (On a side note, this is the first computer I have used extensively that I do not have root permissions on… I feel so limited T_T )
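For what it’s worth, here is the kind of thing I was circling around: a dumb spool-directory queue instead of a named pipe. This is only a sketch, not what actually runs on the cluster, and every name in it (the spool path, the GAUSS_CMD override) is my own invention, not anything standard:

```shell
# a dumb spool-directory job queue (sketch; the names here are my own invention)
spool="${GAUSS_SPOOL:-$HOME/.gaussqueue}"   # where queued jobs wait
mkdir -p "$spool"

# enqueue a job: a timestamp prefix keeps jobs in submission order
# (two adds in the same second would collide; good enough for me)
queue_add() {
    cp "$1" "$spool/$(date +%s)-$(basename "$1")"
}

# see what is waiting
queue_list() {
    ls -1 "$spool"
}

# run queued jobs one at a time, oldest first, removing each when done
queue_run() {
    for job in "$spool"/* ; do
        [ -e "$job" ] || continue
        "${GAUSS_CMD:-g09}" "$job"    # GAUSS_CMD override exists only for testing
        rm -f "$job"
    done
}
```

Since `queue_add` can run while `queue_run` is mid-loop, you can keep tossing jobs in, and `queue_list` answers the “what’s in the queue” question that named pipes never did for me.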

Lastly, this is the script I am happy with. The problem was setting up ONIOM calculations (think of it like an onion: you can use different levels of theory on different shells of the system). For each atom in the system, you need to specify what level of theory to use (for a simple two-level setup, you need H for high and L for low), and you also need to specify link atoms (the atoms that replace those on the boundary of a shell when doing the higher-level calculation). None of the programs I use to construct the XYZ input coordinates had any extensions to help me generate these files, and I didn’t look forward to painstakingly editing these 300+ atom files by hand. Luckily, I found out that in Avogadro, if you copy a selection of atoms and paste into a text editor, you get an XYZ-formatted list of the atoms you selected. At this point it was just some very careful (and painstaking) construction of the proper regexes and sed invocations. I then made a nice script to streamline and error-proof the process (as I may have to show other group members how to do these calculations…)

#!/bin/bash
# Formats a Gaussian input file for ONIOM calculations
# Requires three files:
#   1) The raw Gaussian com file (Cartesian coordinates only for now...)
#   2) An xyz file of all the high-level/QM atoms
#   3) An xyz file of all link atoms/LA (for the boundary between high and low level; for now assumes H replacement)

if [ ${#} -eq 0 ] ; then
    echo "Usage: ${0##*/} comfile [[QM file] [LA file]]"
    echo '    comfile is the raw Gaussian input file'
    echo '    QM file is the xyz file of all high-level atoms (default is ${basename}-qm.xyz)'
    echo '    LA file is the xyz file of all link atoms (default is ${basename}-la.xyz)'
    exit 0
fi

comfile="$1"

if [ ! -e "$comfile" ] ; then
    echo "File does not exist" 1>&2
    exit 1
fi

if [ "${comfile#*.}" != "com" ] ; then
    echo "File is not a Gaussian input file" 1>&2
    exit 1
fi

basename="${comfile%.com}"

# directory to use for temp files
tmpdir="${TMPDIR:-/tmp}"
tmpfile="${tmpdir}/${comfile##*/}"

# resolve the QM and LA files (the -qm/-la default names are just my convention)
if [ -e "${basename}-qm.xyz" ] ; then
    qmfile="${basename}-qm.xyz"
elif [ ${#} -ge 2 -a -e "${2}" ] ; then
    qmfile="${2}"
else
    echo "No QM file found/specified" 1>&2
    exit 1
fi

if [ -e "${basename}-la.xyz" ] ; then
    lafile="${basename}-la.xyz"
elif [ ${#} -ge 3 -a -e "${3}" ] ; then
    lafile="${3}"
else
    echo "No LA file found/specified" 1>&2
    exit 1
fi

# copy .com file to the temp dir and work on that; copy back here when done
cp "$comfile" "$tmpfile"

# Append an L (low level) to every line matching an atom coordinate
sed -ri 's/[A-Z][a-z]?( +\-?[0-9]+\.[0-9]+){3}/& L/' "$tmpfile"

# Split grep output on newlines, not spaces, so each $line is a whole coordinate line
IFS=$'\n'

# For every line in the QM file matching an atom coordinate, replace the L with an H
for line in $(grep -E '[A-Z][a-z]?( +\-?[0-9]+\.[0-9]+){3}' "${qmfile}") ; do
    sed -ri "s/(${line}) L/\1 H/" "$tmpfile"
done

# For every line in the LA file matching an atom coordinate, append an H after the L
for line in $(grep -E '[A-Z][a-z]?( +\-?[0-9]+\.[0-9]+){3}' "${lafile}") ; do
    sed -ri "s/${line} L/& H/" "$tmpfile"
done

# by this point it should be successful; move the tmpfile back
mv "$tmpfile" "$comfile"
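To see what the first sed pass actually does, here is a quick sanity check on a made-up coordinate line (the coordinates are invented; the regex is the same one the script uses):

```shell
# a fabricated atom line, run through the script's coordinate-matching regex
printf 'C   0.123456  -1.234567   2.000000\n' |
    sed -r 's/[A-Z][a-z]?( +\-?[0-9]+\.[0-9]+){3}/& L/'
# -> C   0.123456  -1.234567   2.000000 L
```

The `&` in the replacement is the whole match, so the line comes back intact with “ L” tacked on; the later passes then flip that L to an H for QM atoms, or append an H after it for link atoms.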

I am not a fan of shell scripting and had to look (or re-look) most of this up, so any suggestions would be welcome.


From → Howto
