Brain Addressing System

The brain addressing system (BAS) contains a minimal set of elements needed to unambiguously describe the position of an object or measurement in a reference brain. It consists of four components:

  1. A compact notation to reference brain atlases and common parameters.
  2. Definitions of brain atlases and their associated spatial reference systems.
  3. Transformations to convert data between different atlases.
  4. A central registry of all organizations that host brain atlas definitions (component 2) and/or transformations (component 3).

In this document, we use the term BAS-providers for host organizations listed by component 4; services that rely on BAS are called BAS-clients and data sets that contain BAS meta data are called BAS-users.

BAS solves the following issue: A data set is said to be using coordinates from "atlas A", but it is not clear which variant of "atlas A" it actually uses in terms of its units, resolution, origin landmark, axis orientation, version and supporting imaging data. Note that in BAS, the word atlas is used to denote a spatial reference system (SRS) that finds its basis in a set of brain-related data modalities and the physical space in which they place the brain.

An example that illustrates the need for BAS is the Allen Mouse Brain Atlas. This is a widely used atlas, but it is not well specified. The atlas is defined by an 'Atlas volume' that contains anatomical data in a three-dimensional grid. It is accompanied by an 'Annotation volume' that links each voxel to a class-label that points to a node in a brain structure ontology. The atlas is available in many voxel resolutions, of which 25 micrometer seems to be the default. The voxel (0,0,0) acts as the origin of the atlas; this point has no structural meaning (it is outside the brain). The axes are oriented such that positive X, positive Y and positive Z point to the posterior, inferior and right side of the brain, respectively. The atlas has gone through two re-alignment phases, so that its current version number is 3.

When data sets are linked to this atlas, many relevant properties are often not specified. Research groups tend to have a favorite anatomical orientation; for example, labs that work with imaging data stored as Nifti-files tend to prefer Right-Anterior-Superior (RAS) coordinate systems. Some software packages take the center of the volume as the origin, and calculate positions in millimeters or micrometers from there. An anatomist may want to work with locations relative to an origin that is placed at a landmark location in the brain.

The BAS is designed to make it as simple as possible to define and use brain atlases without ambiguities, in such a way that data sets that deviate from the standard can be made compliant with a minimum of added meta data.

Referencing an atlas

To reference an atlas, a data set needs to specify

  1. An atlas provider, in the form of a URL that points to a set of atlas definitions. In certain scopes a short provider-prefix is used instead. The provider can be omitted when the atlas is defined within the data set itself.
  2. The atlas abbreviation, as specified by the atlas provider.
  3. The atlas orientation, unit and origin landmark.

For the above items 1 to 3, the following shorthand notation is used: {provider-prefix}.{abbreviation}_v{version}[{orientation},{unit}]@{landmark}, where terms wrapped in curly brackets are to be substituted, with brackets removed.

Herein, the provider-prefix is a globally unique short name for the provider. For example, the prefix for provider https://scalablebrainatlas.incf.org/BAS is sba. The abbreviation is the short name of the atlas as specified by the provider. version (optional) is a number preceded by '_v' and represents the version of the atlas, see versioning.

orientation is a three-letter code that specifies the direction of the positive x, y and z axes in terms of anatomical orientations from the set (Left, Right, Anterior, Posterior, Inferior, Superior), and must contain one letter from each pair of opposite directions (L,R), (A,P) and (I,S). The unit is one of m, mm, um (micrometer), nm. The unit can also be a numeric value that specifies the unit in meters, like 1e-6 instead of um, or 0.0254 to specify inches.

The origin landmark can be one of the landmarks listed in the atlas definition, or one of the three pre-defined landmarks: zero, center and corner. The zero landmark is positioned at the coordinate (0,0,0) inside the enclosing volume of the atlas. The center landmark is positioned at the center of the enclosing volume. The corner landmark is special, in that its position depends on the specified orientation. It corresponds to the corner of the enclosing volume that is opposite to the specified orientation. So if the orientation is RAS, then the corner landmark is at the LPI-corner (opposite to the RAS-corner) of the enclosing volume.

Note that in some data formats and software (example: ITK-Snap) an opposite anatomical orientation code is used, where the three letters refer to the negative x, y and z axes.
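The unit and orientation rules above can be sketched in a few lines of Javascript. This is a minimal illustration, not part of the specification; the helper names unitToMeters, isValidOrientation and cornerOf are hypothetical.

```javascript
// Named units from the shorthand notation, expressed in meters.
const UNIT_IN_METERS = { m: 1, mm: 1e-3, um: 1e-6, nm: 1e-9 };

// Convert a unit field (a named unit or a numeric string) to meters per unit.
function unitToMeters(unit) {
  if (Object.prototype.hasOwnProperty.call(UNIT_IN_METERS, unit)) {
    return UNIT_IN_METERS[unit];
  }
  const value = Number(unit);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error("Invalid unit: " + unit);
  }
  return value;
}

// Each anatomical direction mapped to its opposite.
const OPPOSITE = { L: "R", R: "L", A: "P", P: "A", I: "S", S: "I" };

// A valid code has three letters, one from each of the pairs (L,R), (A,P), (I,S).
function isValidOrientation(code) {
  if (!/^[LRAPIS]{3}$/.test(code)) return false;
  return [/[LR]/, /[AP]/, /[IS]/].every((pair) => pair.test(code));
}

// Name of the corner of the enclosing volume where the 'corner' landmark sits:
// the corner opposite to the specified orientation.
function cornerOf(code) {
  if (!isValidOrientation(code)) {
    throw new Error("Invalid orientation: " + code);
  }
  return code.split("").map((letter) => OPPOSITE[letter]).join("");
}
```

For instance, cornerOf("RAS") yields "LPI", matching the example in the text, and unitToMeters("0.0254") returns the inch expressed in meters.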

Default values for orientation, unit and origin landmark are RAS (=right,anterior,superior), 'mm' and 'zero', even if the original atlas creator had a different preference. The reason for this is that by having the same defaults for all atlases, a BAS-client knows the correct settings immediately, without having to connect to the atlas provider to retrieve its defaults.

The following are examples of valid ways to reference the Allen Mouse Brain Atlas, as exposed by provider sba.
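A few illustrative shorthands, built from the notation and defaults described above, together with a minimal sketch of how a BAS-client might fill in the defaults. The pattern used here is a simplified stand-in rather than the formal specification, expandShorthand is a hypothetical helper name, and the landmark name ac is borrowed from the filename example elsewhere in this document.

```javascript
// Illustrative shorthands for the Allen Mouse Brain Atlas at provider sba.
const examples = [
  "sba.ABA_v3",               // all defaults: RAS, mm, zero
  "sba.ABA_v3[PIR,um]@zero",  // explicit orientation, unit and landmark
  "sba.ABA_v3[RAS,um]@ac",    // origin at an atlas-defined landmark
  "sba.ABA_v3[1e-6]@center",  // numeric unit (meters), default orientation
];

// Simplified pattern for illustration only; the formal specification is stricter.
const SHORTHAND =
  /^(?:([A-Za-z][\w+#-]*)\.)?([A-Za-z][\w+#-]*?)(?:_v(\d+))?(?:\[([^\]]+)\])?(?:@([A-Za-z][\w+#-]*))?$/;

// Split a shorthand into its parts and apply the BAS defaults.
function expandShorthand(shorthand) {
  const m = SHORTHAND.exec(shorthand);
  if (!m) throw new Error("Not a BAS shorthand: " + shorthand);
  const result = {
    provider: m[1] || null,
    atlas: m[2],
    version: m[3] ? Number(m[3]) : null,
    orientation: "RAS", // default orientation
    unit: "mm",         // default unit
    landmark: m[5] || "zero", // default origin landmark
  };
  // Orientation and unit may appear in either order inside the brackets.
  const fields = m[4] ? m[4].split(",") : [];
  for (const field of fields) {
    if (/^[LRAPIS]{3}$/.test(field)) result.orientation = field;
    else result.unit = field;
  }
  return result;
}

const expanded = examples.map(expandShorthand);
```

Because the defaults are the same for every atlas, this expansion needs no network access to the atlas provider.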

It is recommended for clarity to always specify an origin landmark explicitly. One special case of the BAS-shorthand is the term local, which refers to the coordinate space of the data before registration to an atlas.

Versioning of an atlas

Defining different versions of an atlas only makes sense if some aspects of the atlas remain constant. We put the following requirements on different versions of the same atlas, which follows the idea that the underlying data modalities of the atlas have been updated, but care has been taken to accommodate data registered to previous versions of the atlas.

  1. The species, strain and age-category must remain the same.
  2. The enclosing volume must remain identical. This also implies that the pre-defined landmarks zero, center and corner remain the same. Other landmarks may slightly shift position.
  3. The best affine transformation between two versions of the same atlas must be the identity transformation. In other words, it should not be necessary to re-register data aligned with a previous version of the atlas. Nonlinear/deformable transformations do need to be re-run.

If these requirements are not met, then the atlas must be given a new name (i.e. not use the _v{version} convention).
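The three requirements can be read as a simple check, sketched below. The object layout (species, strain, ageCategory, enclosingVolume) and the function names are assumptions made for this illustration; how the best affine transformation between two versions is estimated is outside the scope of the sketch.

```javascript
// True when two enclosing volumes have identical bounds.
function volumesEqual(a, b) {
  return ["minR", "maxR", "minA", "maxA", "minS", "maxS"].every(
    (key) => a[key] === b[key]
  );
}

// True when a 4x4 matrix (array of rows) equals the identity within a tolerance.
function isIdentity(matrix, tolerance = 1e-9) {
  return matrix.every((row, i) =>
    row.every((value, j) => Math.abs(value - (i === j ? 1 : 0)) <= tolerance)
  );
}

// Decide whether `next` may be published as a new version of `prev`
// (same atlas name with a higher _v number) or needs a new name.
function canShareAtlasName(prev, next, bestAffine) {
  return (
    prev.species === next.species &&
    prev.strain === next.strain &&
    prev.ageCategory === next.ageCategory &&
    volumesEqual(prev.enclosingVolume, next.enclosingVolume) &&
    isIdentity(bestAffine)
  );
}
```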

Conveying BAS metadata to and from BAS-clients

BAS-clients are programs and services that perform atlasing tasks and thereby read and/or write BAS metadata. When reading/writing a data file, the BAS-client must know where to find/put the BAS metadata. We define the following mechanisms to convey this information:

  1. The BAS-metadata can be placed inside the datafile in a designated field. In existing data formats, this would typically be a free-text comment field. The downside of this approach is that per filetype it needs to be specified where to look for the BAS-metadata. There can also be a size limitation. In Nifti1-files for example, the 'description' field would be a suitable candidate but it is limited to 80 characters.
  2. The BAS-metadata can be put in a small XML or JSON string and be stored separately from the file in a database, or jointly with the data in a zip-file.
  3. A third possibility is to include the BAS-shorthand in the filename of the data. For example, a file named myimage.nrrd could be renamed to myimage.bas{sba.ABA_v3[RAS,um]@ac}.nrrd. This would indicate that the data, after the nrrd-defined transformation from voxel to physical space, is in sba.ABA_v3[RAS,um]@ac space. Here, the .bas{*} wrapper is used to indicate that a piece of BAS-shorthand is included. This third way has the advantage that only a simple file-renaming is needed; it is a practical and transparent solution that prevents multiple versions of the same file. It is a very convenient way to prepare downloaded files for use with a BAS-client without having to touch the file contents. The downside is that no BAS-metadata beyond the BAS-shorthand can be stored in this way.
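The filename mechanism (method 3) can be sketched as follows. The helper names addShorthand and extractShorthand are hypothetical, and the sketch assumes a single-part file extension such as .nrrd.

```javascript
// Matches the .bas{...} wrapper anywhere in a filename and captures its contents.
const BAS_IN_FILENAME = /\.bas\{([^{}]+)\}/;

// Return the BAS-shorthand embedded in a filename, or null if there is none.
function extractShorthand(filename) {
  const m = BAS_IN_FILENAME.exec(filename);
  return m ? m[1] : null;
}

// Insert a .bas{...} wrapper just before the final file extension.
function addShorthand(filename, shorthand) {
  const wrapper = ".bas{" + shorthand + "}";
  const dot = filename.lastIndexOf(".");
  if (dot <= 0) return filename + wrapper; // no extension to preserve
  return filename.slice(0, dot) + wrapper + filename.slice(dot);
}

const renamed = addShorthand("myimage.nrrd", "sba.ABA_v3[RAS,um]@ac");
// renamed is "myimage.bas{sba.ABA_v3[RAS,um]@ac}.nrrd"
```

Double extensions such as .nii.gz would need extra handling, since this sketch only looks at the last dot.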

To support BAS-shorthands in file names, atlas and landmark names and the BAS-provider prefix must start with a letter and thereafter consist of characters from the following regular expression set: [a-zA-Z0-9_\-+#] (the hyphen is escaped so that it is read as a literal character and not as a range). The maximum length of the shorthand is 80 characters including the bas{*} wrapper (75 without it), as set forth in the formal shorthand specification.

With three different ways to convey BAS metadata, there is a risk of conflicts. To resolve such conflicts, we rank the three methods by importance: the most important is metadata passed along separately from the file (i.e. method 2). Second most important is the BAS-shorthand in the filename, and least important is the BAS metadata inside the file.
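This ranking amounts to picking the first available source in a fixed order. A minimal sketch, assuming the three candidates have already been read into an object with the hypothetical keys separate, filename and embedded:

```javascript
// Pick the BAS metadata to use, following the ranking:
// 1. metadata passed separately from the file (method 2),
// 2. BAS-shorthand in the filename (method 3),
// 3. BAS metadata inside the file (method 1).
function resolveBasMetadata(sources) {
  const ranked = [sources.separate, sources.filename, sources.embedded];
  const found = ranked.find((value) => value !== undefined && value !== null);
  return found === undefined ? null : found;
}
```

For example, resolveBasMetadata({ filename: "sba.ABA_v3[RAS,um]@ac", embedded: "sba.ABA_v3" }) returns the filename shorthand, because no separately passed metadata is present.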

What's wrong with existing data formats

One may argue that most neuroimaging data formats already have ways to specify anatomical orientation, resolution, units and origin. This is partially true. While volumetric formats often contain a field (or transformation) to tell which voxel is set as the origin, there is no way to say what landmark this origin actually represents. Neither is there a way to tell which reference atlas the data is registered to. In addition, many data files contain default settings for the metadata fields, and these may be wrong. For example, the Nifti1 standard requires that the file contains an affine transformation that converts voxel space to a physical space with RAS orientation, but often this is not taken care of. Another example is the .nrrd files available through the Allen Institute API. They have their anatomical orientation set to 'left-posterior-superior' while in fact the spatial axes point in the right-inferior-posterior directions. What BAS contributes is that it 1) specifies the atlas and origin landmark of the data, and 2) transparently overrides missing or wrong metadata.

Defining an atlas

While this may seem to be the most daunting task, the minimal description of an atlas is surprisingly simple. It consists of the abbreviated name of the atlas, a reference to the atlas defining document (by DOI or URL) and/or defining datasets (by DOI or URL), the coordinates of the enclosing rectangular volume, plus the position of at least one landmark within this volume. In the common case that the atlas reference space is defined by an MRI volume, the enclosing volume is just the voxel size multiplied by the number of voxels in each dimension. In the case of a stereotaxic atlas that consists of annotated microscopic sections, the volume consists of the stereotaxic coordinates in each direction that approximately mark the edges of the brain. For example, for the Paxinos & Franklin mouse atlas, the enclosing volume runs from about -5 to 5 in the left-right dimension, -9 to 7 in the posterior-anterior dimension, and -7 to 0 in the inferior-superior dimension, all in mm and with respect to bregma. The choice of the enclosing volume does not have to be precise; it is the location of the landmarks that determines precision and that data sets will refer to. The bregma landmark is at coordinate (0,0,0) while the automatically defined center landmark is at RAS-coordinate (0, -1, -3.5). The atlas also defines a second origin, the interaural midpoint, at coordinate (0, -3.8, -5.8).
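The center landmark follows directly from the enclosing volume, as the sketch below shows for the Paxinos & Franklin numbers quoted above. The object layout and the function names centerLandmark and relativeTo are assumptions made for this illustration.

```javascript
// Enclosing volume of the Paxinos & Franklin mouse atlas, in mm relative to bregma.
const pfVolume = { minR: -5, maxR: 5, minA: -9, maxA: 7, minS: -7, maxS: 0 };

// The center landmark is the midpoint of the enclosing volume along each axis.
function centerLandmark(volume) {
  return {
    R: (volume.minR + volume.maxR) / 2,
    A: (volume.minA + volume.maxA) / 2,
    S: (volume.minS + volume.maxS) / 2,
  };
}

// Express a point relative to a different origin landmark (same orientation and unit).
function relativeTo(point, origin) {
  return { R: point.R - origin.R, A: point.A - origin.A, S: point.S - origin.S };
}

const center = centerLandmark(pfVolume);
// center is { R: 0, A: -1, S: -3.5 }, matching the RAS-coordinate given in the text

const interaural = { R: 0, A: -3.8, S: -5.8 };
const bregmaFromInteraural = relativeTo({ R: 0, A: 0, S: 0 }, interaural);
```

Switching the origin landmark is a pure translation, which is why it can be expressed in the shorthand without a separate transformation document.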

The actual definition document of each atlas is an XML or JSON file that follows the BAS xsd-schema. An invertible conversion is used to switch between XML and JSON.

The minimal XML specification of the Paxinos-Franklin atlas is illustrated below. Note that this specification describes the data that underlies the atlas, but does not necessarily provide access to it.

<SpatialReferenceSystem id="PF01">
  <EnclosingVolume minR="-5" maxR="5" minA="-9" maxA="7" minS="-7" maxS="0"/>
  <Landmark id="bregma" coordR="0" coordA="0" coordS="0"/>
  <Landmark id="interaural" coordR="0" coordA="-3.8" coordS="-5.8"/>
  <DefiningDocuments>
    <Book id="PF01" year="2001" doi="10.1016/S0306-4530(03)00088-X">
      <Author>Paxinos G</Author>
      <Author>Franklin KBJ</Author>
    </Book>
  </DefiningDocuments>
  <DefiningDataSets>
    <SliceStack id="labels" sliceDim="A" sliceCount="100" sliceSpacing="120um" modality="annotation"/>
    <SliceStack id="nissl" sliceDim="A" sliceCount="100" sliceSpacing="120um" modality="nissl-stain"/>
  </DefiningDataSets>
</SpatialReferenceSystem>

Defining transformations

http://nipy.org/nibabel/coordinate_systems.html

Transformations lie at the heart of digital atlasing, as they make it possible to integrate data across modalities and scales. We distinguish three categories of transformations. The first category concerns axes flips/swaps, uniform scaling and translation. These very common transformations are covered by the BAS-shorthand notation, where the anatomical orientation permutes axes, setting the unit performs uniform scaling, and selecting an origin landmark translates the data.

The second category concerns affine transformations, whereby a 3-dimensional position in source space, augmented with a 1, is multiplied by a 4x4 matrix to result in a position in target space. These transformations are defined in BAS by selecting a 'from' SRS, a 'to' SRS, and a transformation matrix. We borrow the XML notation for this from the x3dom project, which provides two ways to specify the affine transform, either as a matrix or as a combination of translation, rotation and scaling.

The third category consists of 'everything else'. Here it is not clear beforehand what parameters the transformation needs. To this end, we introduce the concept of transform-providers, which are online resources that define transforms, describe how to use them and define the associated parameters. Typically, the parameters are copied from the transform parameter file of the software package that was used to compute the nonlinear transform. The syntax of these files is covered by the aforementioned BAS xsd-schema.
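The first two categories can be sketched in Javascript as follows. This is an illustration under stated assumptions, not a reference implementation: points are plain arrays of three numbers, matrices are arrays of four rows, and the function names are hypothetical.

```javascript
// Axis index (in RAS order) and sign of each anatomical direction.
const AXIS_OF = { L: 0, R: 0, P: 1, A: 1, I: 2, S: 2 };
const SIGN_OF = { R: 1, L: -1, A: 1, P: -1, S: 1, I: -1 };

// Category 1a: convert a point from any orientation code to RAS (flip and swap axes).
function toRAS(point, orientation) {
  const ras = [0, 0, 0];
  orientation.split("").forEach((letter, i) => {
    ras[AXIS_OF[letter]] = SIGN_OF[letter] * point[i];
  });
  return ras;
}

// Convert a RAS point to any other orientation code.
function fromRAS(ras, orientation) {
  return orientation
    .split("")
    .map((letter) => SIGN_OF[letter] * ras[AXIS_OF[letter]]);
}

// Re-express a point given in orientation `from` in orientation `to`.
function reorient(point, from, to) {
  return fromRAS(toRAS(point, from), to);
}

// Category 1b: uniform scaling between units, both given in meters per unit.
function rescale(point, fromMeters, toMeters) {
  return point.map((value) => (value * fromMeters) / toMeters);
}

// Category 2: multiply the point, augmented with a 1, by a 4x4 matrix.
function applyAffine(matrix, point) {
  const augmented = [point[0], point[1], point[2], 1];
  return matrix
    .slice(0, 3)
    .map((row) => row.reduce((sum, value, j) => sum + value * augmented[j], 0));
}
```

For example, reorient([1, 2, 3], "PIR", "RAS") gives [3, -1, -2]: the third (right) coordinate becomes the first, and the posterior and inferior coordinates change sign to become anterior and superior.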

Multiple BAS-providers

The BAS is an open specification that can be used by multiple atlas providers with only one aspect that needs regulation: the chosen provider-prefix, which must be globally unique. For that purpose, a central table of provider-urls and provider-prefixes is to be maintained. The address of that table is https://incf.org/scalablebrainatlas/BAS/providers.xml, and the procedure to become a provider is described at https://incf.org/scalablebrainatlas/BAS/become-a-provider.html. This table has the additional benefit that BAS-clients can harvest it periodically and visit each provider to build a cache of atlas definitions, transform definitions and inter-atlas transformations.

Another issue with multiple providers is that the same atlas may be defined multiple times. For example, there could be atlases sba.ABA_v3 and hbp.AMBA_v3 that both point to version 3 of the Allen mouse brain atlas. This is no problem in itself, but a mechanism is needed to tell BAS-clients that there is a trivial mapping between the two atlases. The solution is that a BAS-provider, as soon as it is aware of a clone, adds an identity transformation to its public list of inter-atlas transformations.

Formal BAS-shorthand specification

The following is a formal specification of the BAS-shorthand, by means of commented Javascript code that composes a regular expression that the shorthand must match.

// provider prefix, max 8 characters
const pp = "([a-zA-Z][a-zA-Z0-9_\\-+#]{1,7}\\.)";

// atlas name (spatial reference system), max. 16 characters
const srs = "([a-zA-Z][a-zA-Z0-9_\\-+#]{1,15})";

// sub-expression for anatomical orientation
const ao = "([LR][PA][IS]|[LR][IS][PA]|[IS][LR][PA]|[IS][PA][LR]|[PA][IS][LR]|[PA][LR][IS])";

// sub-expression for unsigned floating point, max. 19 characters
const fp = "(?:[0-9]{1,14}[.]?|[0-9]{0,5}[.][0-9]{0,9})(?:[eE][-+]?[0-9]{1,2})?";

// sub-expression for the unit field
const ut = "(m|mm|um|nm|(?:"+fp+"))";

// square brackets part: combine anatomical orientation and unit in arbitrary order
const sb = "(\\["+ao+"(?:,"+ut+")?\\]|\\["+ut+"(?:,"+ao+")?\\])";

// origin landmark part
const lm = "(@[a-zA-Z][a-zA-Z0-9_\\-+#]{1,23})";

// longest possible shorthand has length 75 without bas{*} wrapper, 80 with wrapper
const longestPossibleShorthand = "a1234567.a123456789abcdef[RAS,12345.123456789e-12]@a123456789abcdef01234567";

// compose the final regular expression, and check if the longest possible shorthand matches.
const R = new RegExp("^"+pp+"?"+srs+sb+"?"+lm+"?$","g");
const matches = longestPossibleShorthand.match( R );

XML/JSON specification

Besides using the BAS-shorthand notation, all metadata about BAS is specified in the form of XML/JSON documents. The XML-variant that is described here can be converted to equivalent JSON using the XML-to-JSON service. Each BAS-provider creates a file hierarchy consisting of the following XML-documents:

  1. In the root folder, there is the document "index.xml". It lists all atlases that the BAS-provider exposes, including their id, name, species and strain. It also lists all transforms that the BAS-provider exposes, including their id, name and package.
  2. Next, inside the folder 'atlases', each atlas has its own folder named after its id, and so does each transform.
  3. Inside each atlas folder, there is a file index.xml that contains the atlas definition, and inside each transform folder there is a file index.xml that contains the transform definition.
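A BAS-client could derive the document locations from this hierarchy roughly as sketched below. The function names are hypothetical, the atlas id used in the usage note is only an example, and the folder name "transforms" is an assumption, since the text above names only the 'atlases' folder.

```javascript
// Join a provider base URL and path segments, avoiding duplicate slashes.
function joinUrl(base, ...segments) {
  return [base.replace(/\/+$/, ""), ...segments.map(encodeURIComponent)].join("/");
}

// Root document that lists all atlases and transforms of a provider.
function providerIndexUrl(providerUrl) {
  return joinUrl(providerUrl, "index.xml");
}

// Definition document of a single atlas, inside the 'atlases' folder.
function atlasDefinitionUrl(providerUrl, atlasId) {
  return joinUrl(providerUrl, "atlases", atlasId, "index.xml");
}

// Definition document of a single transform.
// ASSUMPTION: the parent folder name defaults to "transforms"; adjust as needed.
function transformDefinitionUrl(providerUrl, transformId, folder = "transforms") {
  return joinUrl(providerUrl, folder, transformId, "index.xml");
}
```

With the provider URL https://scalablebrainatlas.incf.org/BAS and an atlas id of ABA_v3, atlasDefinitionUrl would return https://scalablebrainatlas.incf.org/BAS/atlases/ABA_v3/index.xml.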

References

PaxinosFranklinMouse