BioCatNet File Template for Biocatalytic Data

To facilitate documentation and submission of biocatalytic data, we provide an Excel file to unambiguously describe an experiment. An empty file template can be downloaded as BioCatNet_template.xlsx.

This file template can be filled in and sent to the BioCatNet administrator. Please give the name of the desired enzyme family database in your e-mail.

Note that one file will correspond to a single experiment!

BioCatNet assigns experiments to experiment sets. An example would be a series of measurements under varying pH. An individual experiment would comprise the result for a distinct set of conditions, i.e. an individual pH adjustment. Results obtained for all the other pH values should be documented in separate files (separate experiments).

Examples of filled-in files can be downloaded here:

Sections of the BioCatNet File

The file template contains the following tabs:

  1. user information
  2. sequences
  3. reaction
  4. conditions
  5. measurement

Sections are marked with a number sign (#). The first line below the section marker is the header line (in italics). Mandatory sections are marked by capital letters. Mandatory input is indicated by headers in capital letters and thick borders. Mandatory units are indicated in the header lines by square brackets. The type "boolean" refers to the digits 0 or 1.
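The conventions above can be checked mechanically. The following Python sketch is only an illustration: the function names and the example headers "TIME [s]" and "source organism" are assumptions, not taken from the template itself.

```python
import re


def describe_header(header: str) -> dict:
    """Classify one header cell by the conventions described above."""
    # A unit, if present, is written in square brackets, e.g. "[s]".
    unit_match = re.search(r"\[([^\]]+)\]", header)
    unit = unit_match.group(1) if unit_match else None
    # Remove the bracketed unit to obtain the plain label.
    label = re.sub(r"\s*\[[^\]]*\]", "", header).strip()
    # Headers written in capital letters indicate mandatory input.
    return {"label": label, "unit": unit, "mandatory": label.isupper()}


def parse_boolean(cell) -> bool:
    """Interpret a cell of type "boolean", i.e. the digits 0 or 1."""
    text = str(cell).strip()
    if text in ("0", "0.0"):
        return False
    if text in ("1", "1.0"):
        return True
    raise ValueError(f"boolean fields accept only 0 or 1, got {cell!r}")


if __name__ == "__main__":
    # Hypothetical header cells, for illustration only.
    print(describe_header("TIME [s]"))
    print(describe_header("source organism"))
    print(parse_boolean(1))
```

The values "0.0" and "1.0" are accepted because spreadsheet readers often return numeric cells as floating-point numbers.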

Insert a new row if you need more than one set of attributes (e.g. for multiple time points of measurement or more than one substrate). For empty values, just leave the field blank. Please do not delete any content from the sections or headers! Additional information can be added at the end of each sheet. These additions will not be inserted into BioCatNet automatically.

1. User Information

The e-mail address under EMAIL should correspond to your user account on the respective BioCatNet database. The field under NAME should be given as first name followed by last name. It will be used by the administrator for correspondence (e.g. notifications about failed or successful submission). Please register for your favorite enzyme family database before submitting the file. A list of available databases can be found on the BioCatNet Homepage.


The # EXPERIMENT section contains the name for the individual experiment below the NAME header. The field under EXPERIMENT SET NAME should be identical for all members of this set.
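As an illustration of how experiments and experiment sets relate, the following Python sketch builds one record per pH value, all sharing the same set name. The function name, the naming scheme for NAME and the example set name are assumptions; only the two header names come from the template.

```python
def experiments_for_ph_series(set_name: str, ph_values: list) -> list:
    """Return one experiment record per pH value (one file per record)."""
    return [
        {
            # Each experiment gets its own name ...
            "NAME": f"{set_name}, pH {ph}",
            # ... while the set name is identical for all members of the set.
            "EXPERIMENT SET NAME": set_name,
        }
        for ph in ph_values
    ]


if __name__ == "__main__":
    # Hypothetical set name and pH values, for illustration only.
    for record in experiments_for_ph_series("pH profile", [6.0, 7.0, 8.0]):
        print(record)
```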

2. Sequences

The # SEQUENCES section should list all protein sequences which were used during the experiment, including mutations or tags.


The field under NAME will be used as annotation. The field source organism is not mandatory but highly recommended. (The name of the expression host will be added under the # ENZYMES section in the reaction tab.) The input under AMINO ACID SEQUENCE will be used for a BLAST search against the respective enzyme family database. If the sequence already exists, the annotated name of the existing sequence can be used instead of your submitted name. Each sequence should be entered on a single line (for a nicer representation within the file). If you have more than one enzyme in your experiment, just add as many lines as you need. One line should correspond to a single enzyme.
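Sequences are often available as multi-line FASTA text. A minimal Python sketch for bringing such text into the single-line form requested above could look as follows; the function name and the example sequence are hypothetical.

```python
def to_single_line(sequence_text: str) -> str:
    """Collapse a (possibly multi-line, FASTA-style) sequence into one line."""
    lines = sequence_text.splitlines()
    # Skip FASTA header lines, which start with ">".
    residues = "".join(
        line for line in lines if not line.strip().startswith(">")
    )
    # Remove any remaining blanks or tabs inside the sequence.
    return "".join(residues.split())


if __name__ == "__main__":
    # Hypothetical sequence fragment, for illustration only.
    fasta = ">example enzyme\nMKTAYIAK\nQRQISFVK\n"
    print(to_single_line(fasta))
```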

Please note that the NAME of the sequence should match the information SEQUENCE NAME in the # ENZYMES section of the reaction tab and the # ENZYME TREATMENT section of the conditions tab.
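Mismatched names are easy to overlook, so a simple consistency check before submission may help. The sketch below assumes the names have already been read from the file into Python lists; the function name and the example names are hypothetical.

```python
def unmatched_references(sequence_names: list, references: dict) -> dict:
    """Report SEQUENCE NAME entries that have no NAME in # SEQUENCES.

    `references` maps a section label to the names used in that section.
    """
    known = set(sequence_names)
    problems = {}
    for section, names in references.items():
        missing = [name for name in names if name not in known]
        if missing:
            problems[section] = missing
    return problems


if __name__ == "__main__":
    # Hypothetical names; note the differing spelling in the second section.
    result = unmatched_references(
        ["LipA wild type", "LipA L17A"],
        {
            "# ENZYMES": ["LipA wild type"],
            "# ENZYME TREATMENT": ["LipA wildtype"],
        },
    )
    print(result)
```

The comparison is exact, so differences in spacing or capitalization are reported as mismatches.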

If you wish to have additional annotations for your protein sequence, e.g. annotation of mutated residues, please contact the BioCatNet administrator.

3. Reaction

This tab describes the reaction(s) under investigation. The # COMPOUNDS section should list all applied compounds, i.e. substrates, products and (where applicable) additives (e.g. inhibitors or detergents which are added in the course of the experiment). To unambiguously identify a compound, the use of the SMILES code is recommended. This simplified molecular-input line-entry system code represents the structure of a compound as a string. You can retrieve the SMILES code for your compounds from other repositories, e.g. NCBI PubChem, or generate it using external tools like PubChem Sketcher.


The NAME field within the # REACTION section provides an annotation for the investigated reaction. (For cascade systems there is more than a single reaction, so just add new lines in this section when your experiment investigates more than one reaction.) The field REACTION specifies the reaction stoichiometry by using coefficients and compound names from the # COMPOUNDS section. A reaction converting 1 mol of substrate A and 1 mol of substrate B to form 1 mol of product C would be given as 1 A + 1 B -> 1 C. Note that the stoichiometric coefficients are given, even if they equal one. The reaction equation contains -> to distinguish between substrate(s) and product(s).
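The format of the REACTION field can be checked with a small parser. The Python sketch below is an assumption about how such a check might look, not the parser used by BioCatNet. It assumes integer coefficients, terms separated by " + " (with blanks, so that names such as NAD+ remain intact) and exactly one "->". The compound names in the example are hypothetical entries of a # COMPOUNDS section.

```python
import re

# One term: an integer coefficient, whitespace, then the compound name.
TERM = re.compile(r"(\d+)\s+(.+)")


def parse_side(side: str, known_compounds: set) -> list:
    """Parse one side of the equation into (coefficient, name) pairs."""
    terms = []
    for raw in side.strip().split(" + "):
        match = TERM.fullmatch(raw.strip())
        if match is None:
            raise ValueError(f"missing coefficient or name in term: {raw!r}")
        coefficient, name = int(match.group(1)), match.group(2).strip()
        if name not in known_compounds:
            raise ValueError(f"{name!r} is not listed in # COMPOUNDS")
        terms.append((coefficient, name))
    return terms


def parse_reaction(equation: str, known_compounds: set) -> tuple:
    """Split the equation at '->' and parse substrates and products."""
    if equation.count("->") != 1:
        raise ValueError("expected exactly one '->' in the reaction equation")
    substrates, products = equation.split("->")
    return (
        parse_side(substrates, known_compounds),
        parse_side(products, known_compounds),
    )


if __name__ == "__main__":
    compounds = {"ethanol", "NAD+", "acetaldehyde", "NADH", "H+"}
    equation = "1 ethanol + 1 NAD+ -> 1 acetaldehyde + 1 NADH + 1 H+"
    print(parse_reaction(equation, compounds))
```

A term without an explicit coefficient is rejected, in line with the note that coefficients are given even if they equal one.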

The # ENZYMES section should repeat the NAME from the # SEQUENCES section below the SEQUENCE NAME field. The time point of addition is a mandatory input. Currently, second [s] is the only valid unit for time (we are currently amending the file template and our databases to allow more convenient user-defined units).
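Since times have to be entered in seconds, values recorded in other units need to be converted first. A minimal Python sketch follows; the function name and the unit abbreviations are assumptions.

```python
# Conversion factors to seconds; the unit labels are hypothetical choices.
SECONDS_PER_UNIT = {"s": 1, "min": 60, "h": 3600, "d": 86400}


def to_seconds(value: float, unit: str) -> float:
    """Convert a time value to seconds, the unit expected by the template."""
    if unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unknown time unit: {unit!r}")
    return value * SECONDS_PER_UNIT[unit]


if __name__ == "__main__":
    print(to_seconds(30, "min"))  # 30 minutes -> 1800
    print(to_seconds(1.5, "h"))   # 1.5 hours -> 5400.0
```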

Note that the fields below COMPOUNDS in the # SUBSTRATES and # additives sections should repeat the NAME from the # COMPOUNDS section.

The amount of enzyme can be specified by various units, e.g. U/ml, U/l, g, mg (e.g. from Bradford assay), mol, mmol, umol (for µmol), nmol.

Please quantify your enzyme(s) either by concentration (in M, i.e. mol/l) or by amount and volume (e.g. volume added from stock solution). For substrate(s) or additive(s), please likewise choose between quantification by concentration or by amount and volume. Additives can comprise inhibitors, detergents or other compounds which are neither substrate nor product. If you specify amount and volume, please also give the initial reaction volume (initial volume) in the # REACTION CONDITIONS section of the conditions tab.
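The relation between the two ways of quantification can be illustrated with a short calculation. The Python sketch below rests on an assumption that is not stated in the template: the final volume is taken as the initial reaction volume plus the added volume. How BioCatNet itself derives concentrations is not described here. The function name is hypothetical.

```python
# Conversion factors to mol for the amount units listed above.
MOL_PER_UNIT = {"mol": 1.0, "mmol": 1e-3, "umol": 1e-6, "nmol": 1e-9}


def concentration_in_molar(
    amount: float, unit: str, added_volume_l: float, initial_volume_l: float
) -> float:
    """Estimate the molar concentration (mol/l) from amount and volume.

    Assumption: final volume = initial reaction volume + added volume.
    """
    if unit not in MOL_PER_UNIT:
        raise ValueError(f"unsupported amount unit: {unit!r}")
    total_volume_l = initial_volume_l + added_volume_l
    if total_volume_l <= 0:
        raise ValueError("total volume must be positive")
    return amount * MOL_PER_UNIT[unit] / total_volume_l


if __name__ == "__main__":
    # 5 umol added in 0.1 ml to an initial volume of 0.9 ml.
    print(concentration_in_molar(5, "umol", 0.0001, 0.0009))
```

In this example the result is approximately 0.005 M, i.e. 5 mM.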

4. Conditions

The # REACTION CONDITIONS section can comprise multiple attributes. The entry below the description field can contain detailed information (up to 600 characters), e.g. reaction time or additions of cofactors, ions or co-solvents. (This information might be split into additional fields in a new release of our file template.) Please note the units in square brackets for the respective entries. The term shaking refers to the shaking of the reaction, not the host culture. The unit for shaking is rpm, i.e. revolutions per minute. The (SI) unit for pressure is Pa.
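Two of these requirements lend themselves to a quick check before submission: the length limit of the description and the pressure unit. The following Python sketch is an illustration only; the function names and the set of accepted source units are assumptions.

```python
MAX_DESCRIPTION_LENGTH = 600  # character limit stated above

# Conversion factors to pascal (Pa).
PASCAL_PER_UNIT = {"Pa": 1.0, "kPa": 1e3, "bar": 1e5, "atm": 101325.0}


def check_description(text: str) -> str:
    """Return the text unchanged if it fits into the description field."""
    if len(text) > MAX_DESCRIPTION_LENGTH:
        raise ValueError(
            f"description has {len(text)} characters, "
            f"the limit is {MAX_DESCRIPTION_LENGTH}"
        )
    return text


def to_pascal(value: float, unit: str) -> float:
    """Convert a pressure value to Pa, the unit expected by the template."""
    if unit not in PASCAL_PER_UNIT:
        raise ValueError(f"unknown pressure unit: {unit!r}")
    return value * PASCAL_PER_UNIT[unit]


if __name__ == "__main__":
    print(to_pascal(1, "atm"))  # 101325.0
    print(check_description("reaction time 2 h, 1 mM cofactor added"))
```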


The # ENZYME TREATMENT section should repeat the NAME entry from the # SEQUENCES section. The EXPRESSION HOST should be specified in detail, i.e. including the strain. The fields below preparation/purification can give additional information which might be of interest to other experimenters, e.g. keywords describing the purification method of the applied enzyme. The boolean attributes purified enzyme, crude cell, immobilized enzyme and whole cell facilitate database searches for experiments by these criteria (e.g. a search for experiments using immobilized enzyme).

The # BUFFER section contains the NAME as well as a more detailed DESCRIPTION of the used reaction buffer.

5. Measurement

The measurement tab contains two sections: # measured compounds and # measured parameters.

The # measured compounds section can list the time course of concentrations for substrate(s) and/or product(s). Currently, we support molar concentrations [M] and times in seconds [s]. The measurement method can be given as plain text. The replication number can specify replicates taken during the experiment.

The # measured parameters section can contain measured ratios (in %) like yield, conversion or enantiomeric excess (ee). If you plan to submit other parameters, please contact the BioCatNet administrator. The observation time should be given in seconds [s]. Currently, we support Enantiomeric Excess (S), Enantiomeric Excess (R), Yield and Conversion for parameter. Short names can be given for enantiomeric excess as ee (S) or ee (R) under parameter abbreviation.
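The supported parameter names and abbreviations can be collected in a small lookup table. The Python sketch below also shows the commonly used definition of enantiomeric excess in percent, ee = 100 * (major - minor) / (major + minor). The function names are hypothetical, and the formula is the standard textbook definition rather than something prescribed by the template.

```python
# Supported parameters and their short names as listed above
# (None where no abbreviation is mentioned).
SUPPORTED_PARAMETERS = {
    "Enantiomeric Excess (S)": "ee (S)",
    "Enantiomeric Excess (R)": "ee (R)",
    "Yield": None,
    "Conversion": None,
}


def check_parameter(name: str) -> str:
    """Return the name if it is one of the currently supported parameters."""
    if name not in SUPPORTED_PARAMETERS:
        raise ValueError(f"unsupported parameter: {name!r}")
    return name


def enantiomeric_excess_percent(major: float, minor: float) -> float:
    """Enantiomeric excess in % from the amounts of both enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("the sum of both enantiomers must be positive")
    return 100 * (major - minor) / total


if __name__ == "__main__":
    # 95 parts (S) and 5 parts (R) correspond to an ee (S) of 90 %.
    print(check_parameter("Enantiomeric Excess (S)"))
    print(enantiomeric_excess_percent(95, 5))
```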

The COMPOUND NAME should be identical to the NAME in the # COMPOUNDS section of the reaction tab. Plain text can be used to mention the detection method and additional information (e.g. equations describing a parameter). The reference sequence name should match the NAME from the # SEQUENCES section, e.g. when conversion of a substrate (observed for a mutant) refers to a wild type sequence given in the # SEQUENCES section.