# 10. Molecular data features¶

The basics of Molecular data visualization in VisIt are found in the Molecule Plot, the Create Bonds operator and the Replicate operator.

## 10.2. Other plots and operators¶

The following images show plots and operators you might use to explore your data apart from the Molecule Plot and related operators. These examples show charge density and force vectors associated with the raw molecular positions and species, all combined in the same window as a Molecule Plot.

## 10.3. Analysis Capabilities¶

### 10.3.1. Subset Selection¶

The screenshot in Figure 10.17 shows the same plot in two windows, but with different subset selection. The top image shows the standard Molecule plot of a data set. The bottom shows the Molecule Plot, but with the “Subset” set to de-select Oxygen atoms.

Various file format readers may present a different set of subsets to the user through VisIt. For example, the Protein Data Bank reader presents compounds, residues, and atom type. The VASP reader presents only the atom type, but is smart enough to restrict the choice to only those elements actually present in the file (while the PDB reader presents all 100+ element types).

### 10.3.2. Atomic Color Tables¶

VisIt includes a variety of color tables, some for continuous variables and some for discrete variables. For molecular plots, such as ones coloring atoms by their species, VisIt includes color tables which match up with residue types or atomic numbers and have similar colors to conventional ones used. The ones included with VisIt for atomic numbers are called “cpk_rasmol” and “cpk_jmol”, and for residue types are “amino_rasmol” and “amino_shapely”.

However, you can also create your own. The easiest way is to start by selecting one of these, typing a new name, e.g. “my_atom_colors”, and clicking the New button. This makes a copy of the selected color table with the new name. You can then edit the colors at will, and when you Save Settings (in the Options menu), it will keep your new color tables in future sessions.

Note that in Figure 10.18, you see one of the features of the color table editor for atomic data, which is to provide hint labels for the colors in the grid. Normally these are displayed as numbers, but for atomic color tables it will display the element’s symbol instead. Note: VisIt assumes if the number of colors matches what is in the provided atomic number color tables (which is 110) that it is an atomic color table. So make sure if you’re creating a new atomic color table to create one with the correct number of color values.

### 10.3.3. Expressions¶

#### 10.3.3.1. Basic Expression Support¶

Numeric expressions, created in VisIt’s Expressions window, are compatible with molecular data types. For example, if one created the variable “zcoord” as a Scalar, defined as “coords(mesh)[2]” (where “mesh” is the name of the mesh in your data file containing the atomic data), then it will create a new value, centered at the atoms, of the value of the Z coordinate of the atoms.

#### 10.3.3.2. Enumerate Expression¶

One useful expression for some molecular data files is the Enumerate Expression. The most common use case is if your data file contains only a species type index, such as {0, 1, 2, etc.}, but does not have support for mapping this index to an actual atomic number. In this case, some molecular operations in VisIt, which require an atomic number (often called “element”), will not work. In this case, you can use the Enumerate Expression to map, e.g. “0” to “14” (Si), “1” to “80” (Hg), etc. Typically you want to call this new scalar variable “element” as this is the convention VisIt follows by default for this variable (though in some plots/operators you can specify a different one).

For example, the LAMMPS readers and VASP POSCAR reader do not have intrinsic knowledge of which type of atom in the file maps to which atomic number – but they do report the atom type (0,1,2…) as a variable called “species”. To enable the VisIt features which use atomic number, define a new expression, called “element”, of type “Scalar Mesh Variable”, with the definition “enumerate(species, [14,80,8])”, which maps the first type to Si, the second to Hg, and the third to O.

### 10.3.4. Enhanced Rendering¶

#### 10.3.4.1. Plot Quality¶

Most plots have a number of options which can increase their quality at the cost of performance. Some examples follow.

#### 10.3.4.4. Annotations¶

The example in Figure 10.25 shows the same plot before and after modifying various annotation properties, such as:

• switching to a darker, gradient background
• turning off the 3D bounding box, coordinate axes, and triad
• disabling database and user information
• moving the legend, changing its orientation and size
• adding a time slider progress bar, and text showing the time value

#### 10.3.4.5. File Export¶

VisIt has the ability to save windows, not just as image formats like PNG and JPEG, but as data files which can be imported into other tools. Some of these data types can be imported back into VisIt or other visualization and rendering tools which might have different rendering features of interest for making renderings.

#### 10.3.4.6. POV-Ray¶

One of the exportable data file types in VisIt, after composing your plots in VisIt, is a set of POV-Ray scene description files, which are commented and composed in a manner intended to be tweakable by users to achieve results better than what one could get with a real-time rendering tool. See below for an example.

### 10.3.5. Data File Formats¶

VisIt contains readers for over 100 different scientific, code-specific, and other general file formats. Below are listed several of the most specific to molecular data.

Note that many of these formats have lax restrictions on naming, and VisIt may not automatically detect the file type. To force VisIt to try your desired file reader (as listed in quotation marks in the section header below), use that reader’s name as the input to the “-assume_format” command when launching VisIt. For example, “visit -assume_format CTRL” will try the LMTO CTRL reader before reverting to its automatic detection code, and “visit -assume_format LAMMPS” will try the two LAMMPS readers first.

#### 10.3.5.1. VASP (CHGCAR, POSCAR, OUTCAR) File Formats¶

The VASP code, as described in the link, is “a package for performing ab-initio quantum-mechanical molecular dynamics (MD) using pseudopotentials and a plane wave basis set.” Its output is ASCII text in several files, and the VASP reader in VisIt supports “OUTCAR” and “POSCAR” for varieties of atomic positions and variables, and “CHGCAR” for charge density grids.

Since the charge density grids can get very large, the VisIt CHGCAR reader is actually parallelized to help speed the ASCII-binary conversion process on multi-node machines when using the MPI-enabled version of VisIt’s computation engine. It will decompose the grid into as many domains as you have processors, and each will read and process its chunk of data. Since this is an ASCII format, the speedup for the I/O portion will not scale to large numbers of processors, but the decomposition will also help the rest of the pipeline scale in parallel for other compute-intensive operations.

#### 10.3.5.2. LAMMPS (input structure and output dump) File Formats¶

LAMMPS is the “Large-scale Atomic/Molecular Massively Parallel Simulator”. The VisIt LAMMPS reader supports two flavors of data files used with LAMMPS.

The first is the output dump file in Atom style, usually ending in “.dump”. Here’s a small example of that format with three variables per atom (the final three columns):

ITEM: TIMESTEP
1500
ITEM: NUMBER OF ATOMS
5
ITEM: BOX BOUNDS
0.0 2.0
0.0 3.0
0.0 2.5
ITEM: ATOMS
2 1  0.0 0.0 1.0  0 0 0
4 1  2.0 3.0 2.5  0 0 0
1 2  1.4 0.7 0.0  0 3 1
3 2  0.3 1.0 0.5  0 1 7
5 2  1.7 2.0 0.2  0 7 7


In this example, the second and fourth atoms are of the first species type, and the first, third, and fifth are of a second species. So you’ll need to create an enumerate expression to create the atomic numbers needed for various molecular operations. For example, create a variable called “element”, of type Scalar Mesh Variable, and define it as “enumerate(species, [1, 8])” – this maps the first species to hydrogen, and the second to oxygen.

Note that the LAMMPS Atom-style dump has changed: the ITEM line with ATOMS now specifies the columns which were be written out. To continue supporting the old atom-style dump format, the reader assumes a format string of “id type x y z” (i.e. unscaled atom coordinates) if the line only contains the word “ATOM” with no format specified. The new default is “id type xs ys zs” (scaled atom coordinates) for the updated format. See the LAMMPS documentation of the “dump” command for details.

The second format is the input format used for the LAMMPS “read_data” command. Its file extension is not standardized, but can sometimes be “.eam”, “.meam”, and “.rigid”.

Position data on strange chemical

5       atoms
2       atom types
0.0 2.0     xlo xhi
0.0 3.0     ylo yhi
0.0 2.5     zlo zhi

Atoms

2    1      0.0           0.0           1.0
4    1      2.0           3.0           2.5
1    2      1.4           0.7           0.0
3    2      0.3           1.0           0.5
5    2      1.7           2.0           0.2


(As an aside, note that there is a “proper” EAM file containing pair potentials. Though the “EAM” refers to the embedded atom potential method in both usages, these are different files.)

#### 10.3.5.3. The ProteinDataBank (.pdb) File Format¶

The Protein Data Bank (PDB) archive contains molecular files in a standard ASCII format. The format, however, is used for a wide range of molecular data, not just proteins. See the docs for a full description of the file format. The PDB reader supports ATOM, HETATM, HETNAM, MODEL/ENDMDL, TITLE, SOURCE, CONECT, and COMPND directives.

This is a simple example of a 2-compound, 4-element type data file with a single model.

COMPND    First
ATOM      1  N   TYR A   1      27.557 -46.589  10.074  1.00  0.00           N
ATOM      2  H   TYR A   1      28.603 -46.872   9.068  1.00  0.00           H
COMPND    Second
ATOM      3  C   TYR A   1      29.675 -45.772   8.980  1.00  0.00           C
ATOM      4  O   TYR A   1      30.403 -45.678   7.992  1.00  0.00           O


#### 10.3.5.4. The XYZ File Format¶

The .xyz file format is a simple ASCII format used for describing atom positions, species, possibly variables, and possibly with multiple time steps. Here’s a simple example file:

   3
Some file comment
H      22.3844     2.0352     0.0000
O      18.4512     3.5123     0.0000
Cu     14.2455     6.1056     7.3436


Note that the first line lists the number of atoms, the second is a comment (or blank), and the third starts the data. In each data line, there is the element name, then the X, Y, and Z coordinates. Note that you may have several variables after the Z coordinate – VisIt will allow up to 6 extra variables. Below is an example with three extra variables, which will be called “var0” through “var2” inside VisIt, and can be combined into vectors or included in any other plotting or analysis operation VisIt supports.

3

H      22.3844     2.0352     0.0000     7   7.8    8
O      18.4512     3.5123     0.0000    12   1.6    9
Cu     14.2455     6.1056     7.3436    10   1.4   10


To support multiple timesteps in a single file, simply concatenate each timestep at the end of the previous one, with no blank lines or other separators. The VisIt XYZ reader also supports atomic numbers instead of element symbols in the first column and also supports the rather dissimilar CrystalMaker flavor of .xyz file (which we don’t describe here).

Wikipedia has a page on the XYZ format, though it does not mention the possibility of extra variables or multiple timesteps, both of which are supported by VisIt.

#### 10.3.5.5. The LMTO CTRL File Format¶

The CTRL file is a format used by the STUTTGART TB-LMTO program. LMTO is the linear muffin-tin orbital method used in density functional theory (DFT). This CTRL reader supports the STRUC, CLASS, SITE, ALAT, and PLAT file categories. (See this page for more details.)

#### 10.3.5.6. Using the VTK File Format for Molecular Data¶

The VTK file format is well-understood by VisIt, as it is the underlying low-level data model for many of its internal data types. The VTK structure best used for molecular data is that of a “vtkPolyData” type, where the vertices are the atoms, lines are the bonds (if desired), and fields on the atoms are point data fields. An example of an approximate of a water molecule in the ASCII VTK file format is shown below:

# vtk DataFile Version 3.0
vtk output
ASCII
DATASET POLYDATA

POINTS 3 float
1.0 0.5 1.5
0.2 0.1 0.8
0.4 0.2 2.3

LINES 2 6
2 0 1
2 0 2

VERTICES 3 6
1 0
1 1
1 2

POINT_DATA 3
SCALARS element float
LOOKUP_TABLE default
8 1 1
SCALARS somefield float
LOOKUP_TABLE default
0.687 0.262 0.185


If you have no bonds in the file, or would prefer to use the Create Bonds operator to generate them inside VisIt, simply drop the three lines of text in the “LINES” section of the file. For more detailed information about the VTK formats, see http://www.vtk.org/VTK/img/file-formats.pdf. Note that what are called the “Legacy” formats are both simpler and may be more widely supported than the more recent, and complex, XML formats

#### 10.3.5.7. Acknowledgements¶

This work was supported in part by the Department of Energy (DOE) Office of Basic Energy Sciences (BES), through the Center for Nanophase Materials Sciences (CNMS) and Oak Ridge National Laboratory (ORNL), as well as the Advanced Simulation and Computing (ASC) Program through Lawrence Livermore National Laboratory (LLNL).