FACTOR
Contents

The total number of entries in the FACTOR table (see Statistics) does not reflect the number of independent transcription factors. First of all, homologous factors from different species, such as human and mouse SRF, are given in separate entries since they may differ in some molecular aspects. Moreover, factors that were originally described by different research groups as binding to different genes may turn out to be identical once they have been cloned. On the other hand, more and more factors are recognized as representatives of whole transcription factor families, comprising products of distinct but very similar genes or alternative splice products. In many cases, a more general term that originally defined just a specific DNA-binding activity, such as AP-1, appears as one entry. In most cases, this activity has not been analyzed for its subunit composition in terms of members of the Jun and Fos families. Nevertheless, all fos- and jun-related proteins are included as separate entries.

All factors that are mentioned in the SITE table appear in the FACTOR table as well. However, FACTOR also includes polypeptides that do not bind to DNA by themselves. One well-known example is c-Fos, which contacts DNA only when complexed with, e. g., c-Jun. Information about non-DNA-binding subunits of transcription factor complexes, such as the TAFs, is given in FACTOR as well. There are also proteins of regulatory importance that act as inhibitors of a particular DNA-binding activity. Therefore, proteins such as Id, IkappaB or hsp90 have been included in TRANSFAC® FACTOR.

On the other hand, proteins that merely carry a putative DNA-binding motif have in general not yet been entered. Thus, many more zinc finger proteins are known than are included in FACTOR, but for many of them no data about DNA-binding specificity or other important gene-regulatory features are available.

In general, a protein is a potential entry for TRANSFAC® if it fits the following definition:

"A transcription factor is a protein that regulates transcription (after nuclear translocation) by sequence-specific interaction with DNA or by stoichiometric interaction with a protein that can be assembled into a sequence-specific DNA-protein complex."



In addition to transcription factors, with release 10.4 we have started to include microRNAs (miRNAs) in the FACTOR table as well. These control the stability or translation of messenger RNA (mRNA) by sequence-specific interaction.



Fields

It should be noted that in individual entries some fields may be empty. In this case, these fields are not displayed.
  Factor summary   general information about the FACTOR entry:
number of features, of expression patterns, of linked binding sites, of interacting factors, of matrices, of linked external database entries, and of references.
AC Accession number   "T" + 5-digit number
AS Accession numbers, secondary   when two or more entries are merged, the additional accession numbers, separated by commas, are stored in this field
ID Identifier   "T" + 5-digit number (identical with accession number)
DT Created / Updated   date of entry creation; entry author / date of last entry update; updater
FA * Factor name   (normally the most commonly used) name of the factor (NOTE: Greek letters are expanded to alpha, beta, gamma etc.)
SY * Synonyms   alternative names of the transcription factor
OS Species   biological species (in some cases, when the species of the protein used in an experiment was not clearly given in a publication, the species is assigned to a "taxonomic class" (mammalia, vertebrata, ...); complex entries may have more than one species assigned, when the subunits used in the experiment were derived from different species)
OC Taxonomic classification   systematic biological classification of the species
GE Encoding gene   GENE accession no.; short gene term; HGNC: standard gene symbol.
HO Homologs   suggested homologous transcription factors from distant biological species (e. g., yeast HAP2 as homolog to human CP1B)
CL Factor class   assignment of the factor to the comprehensive transcription factor classification (class accession number linked to the CLASS table entry; class identifier; decimal classification number linked to the classification tree.)
TY Type   the type of this factor entry (not yet given in all entries); possible values are:
family, group entry which summarizes different products of (closely related) paralogous genes
isogroup, group entry which summarizes different products (e.g. alternative splice variants) of the same gene
basic, for specific isoforms (concrete existing monomeric proteins)
complex, factors consisting of more than one non-covalently bound protein/molecule
miRNA, micro RNA
HP Superfamilies   lists generic entries (isogroup or family) to which this factor belongs
HC Subfamilies   lists entries, e.g. splice variants or family members of this isogroup/family entry
SZ Size   length (number of amino acids); calculated molecular mass in kDa (derived from cDNA / genomic clones); experimental molecular mass (or range) in kDa (experimental method, e. g. SDS PAGE, GF/gel filtration)
SQ Sequence   protein sequence of the factor (for miRNAs: RNA sequence)
SC Sequence source   source of the (protein) sequence (e. g., SwissProt, PIR, MIRBASE)
FT Feature table   local features of the factor molecule:
first position   last position   feature
SF Structural features   global structural features of the factor
CP Cell specificity (positive)   organs / cells in which the factor has been demonstrated to be expressed
CN Cell specificity (negative)   organs / cells in which the factor has been demonstrated NOT to be expressed
EX Expression pattern   organ, cell name, system, developmental stage; relative level of expression (very high, high, medium, low, very low, detectable or none); detection method; molecule type detected, i.e. RNA or protein; [reference]
FF Functional properties   functional properties of the factor including more detailed explanations of its expression pattern and of its regulation
IN Interacting factors   factors which interact physically with the factor of this entry (as the applied methods may also include co-immunoprecipitation from crude cell extracts, or similar, it cannot be excluded that in some cases the binding between the two proteins was mediated by a third protein) (linked accession number; name; biological species)
ST Subunits (Precursors)   subunits of the given factor (complex), or the precursor/unmodified form (for non complexes)
CX Complexes   a list of complexes which contain this factor
MX Matrices   MATRIX table entries providing DNA-binding profiles of the factor (linked accession number; identifier.)
BS Binding sites / Regulated genes   DNA (or RNA) sequences shown to be bound by the factor (linked accession number; identifier; "Quality" of the factor-site interaction on a six-level scale) and, for genomic sites, (short gene term; GENE accession no.; biological species)
BR Binding region (ChIP-chip)   DNA fragment from ChIP-on-chip experiments (linked accession number; "Quality" of the factor-DNA interaction; biological species)
DR External database links   database name (e. g. BKL, EMBL, SwissProt, PIR, Flybase, PDB, DATF, MIRBASE, TRANSCompel, PathoDB, SMARtDB, TRANSPATH): database accession number, identifier (where available).

In the case of EMBL cross-links, (r) denotes a reference to an RNA/cDNA sequence and (g) a reference to a genomic DNA sequence.

RSNP: accession number; EMBL: accession number; pos: SNP position in EMBL sequence; var: variation introduced by SNP; effect of SNP (example: RSNP: 97894; EMBL: M61108; pos: 716; var: a,g; amino acid exchange, A47->T)
RN Reference number   [consecutive entry reference number]; reference accession number.
RX PUBMED; link to PubMed entry.
RA Reference authors   authors (NOTE: accents are omitted, German umlauts are transcribed as follows: ä -> ae, ö -> oe, ü -> ue; German "s-z" (ß) -> ss)
RT Reference title   reference title (NOTE: Greek letters are expanded to alpha, beta, gamma etc.)
RL Reference source   journal volume:pages (year)

* These fields are commonly searched
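
The transcription rules noted for the FA, RA and RT fields can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the procedure actually used to build the database. The function names are hypothetical, and the handling of capital umlauts and capital Greek letters is an assumption, since the notes above list only lowercase examples.

```python
import unicodedata

# Replacements named in the RA note; the capitalised forms are an assumption.
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue",
          "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def normalize_author(name: str) -> str:
    """Transcribe umlauts and sharp s, then omit all remaining accents."""
    # Compose first so that 'u' + combining diaeresis is seen as 'ü'.
    name = unicodedata.normalize("NFC", name)
    for src, dst in GERMAN.items():
        name = name.replace(src, dst)
    # Decompose and discard the combining marks (the accents).
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_greek(text: str) -> str:
    """Replace plain Greek letters by their spelled-out lowercase names."""
    out = []
    for ch in text:
        words = unicodedata.name(ch, "").split()
        # Matches e.g. 'GREEK SMALL LETTER KAPPA'; other characters pass through.
        if len(words) == 4 and words[0] == "GREEK" and words[2] == "LETTER":
            out.append(words[3].lower())
        else:
            out.append(ch)
    return "".join(out)
```

For example, `normalize_author("Müller")` gives `Mueller`, and `expand_greek("NF-κB")` gives `NF-kappaB`, the spelling used throughout this documentation.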


Explanations

In the present release, no identifiers have yet been assigned to the FACTOR entries; instead, the accession numbers are repeated in the "Identifier" field.
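
As a rough illustration, the following Python sketch reads one entry and checks that the accession number has the form "T" + 5 digits and that the identifier repeats it. It assumes a plain-text export in which every line starts with the two-letter field code followed by whitespace and the value; that layout is an assumption and is not specified on this page. The sample entry and the function names are made up for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List

ACCESSION = re.compile(r"T\d{5}")  # "T" + 5-digit number


def parse_entry(text: str) -> Dict[str, List[str]]:
    """Collect the values of each two-letter field code (assumed line layout)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        match = re.match(r"([A-Z]{2})\s+(.*\S)", line)
        if match:
            fields[match.group(1)].append(match.group(2))
    return dict(fields)


def check_accession(fields: Dict[str, List[str]]) -> bool:
    """True if AC is well formed and ID is identical to AC."""
    ac = fields.get("AC", [""])[0]
    identifier = fields.get("ID", [""])[0]
    return bool(ACCESSION.fullmatch(ac)) and identifier == ac


# Hypothetical entry, not taken from the database.
SAMPLE = """\
AC  T00000
ID  T00000
FA  example factor
SY  EF-1
SY  example-factor-1
"""

fields = parse_entry(SAMPLE)
print(fields["SY"], check_accession(fields))
```

Repeated codes such as SY are kept as a list, so fields that hold several values need no special treatment.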

The field "Synonyms" covers different spellings (AP-1/AP1) as well as real alternative names (HNF-1: HNF-1alpha, APF, LF-B1). In contrast, the field "Homologs" indicates the names of other proteins, frequently from evolutionarily more distant species, which may be functionally and/or structurally related to the factor under consideration.

The field "Factor class" indicates the major class of DNA-binding domains to which a factor may be assigned. It also contains a systematic decimal classification number referring to the proposed transcription factor classification scheme. Note that this is a tentative assignment which may change as insights into the structure-function relationships of this large protein category develop.

The "Size" field shows the number of amino acid residues of a polypeptide and its molecular weight. The method by which this figure has been obtained is indicated in brackets: (cDNA) or (gene) means that it has been calculated after cloning, whereas (SDS) or (sedim.) points to the corresponding experimental approach.

The "Sequence" field contains the full amino acid sequence of the transcription factor. It may have been copied from SwissProt or PIR or conceptually translated from an EMBL/GenBank/DDBJ nucleic acid sequence, as indicated in the "Sequence source" field. If some manual editing has been done, this is also indicated in that field.

The "Feature table" may contain information on:

  • regions that are enriched in some amino acid residues and may therefore represent trans-activating domains; the content is given as (M/N) which means that M out of N residues are of the enriched amino acid;
  • positions of the typical DNA-binding/dimerization motifs and the motif structure within the individual molecule; e. g. tryptophan cluster motifs are explained with regard to Trp spacing, and the nature of a leucine zipper is given as well (e. g. L4 means that it consists of four leucine residues spaced by 6-AA-intervals, L2EL2 indicates a motif such as L-X6-L-X6-E-X6-L-X6-L);
  • the amino acids that coordinate the zinc ion(s) in a zinc finger motif (e. g. C2HC for three cysteines and one histidine);
  • posttranslational modifications (phosphorylation, glycosylation).
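
The shorthand used in the feature table can be expanded mechanically. The Python sketch below follows the two examples given above (L4 and L2EL2) and the (M/N) content notation; the function names are hypothetical, and the assumption is that every zipper code is a sequence of one-letter residues with optional repeat counts.

```python
import re


def expand_zipper(code: str, spacer: str = "X6") -> str:
    """Expand a leucine zipper code such as 'L2EL2' into its full motif."""
    residues = []
    # Each token is a one-letter residue followed by an optional repeat count.
    for letter, count in re.findall(r"([A-Z])(\d*)", code):
        residues.extend(letter * int(count or 1))
    # Heptad positions are separated by six arbitrary residues.
    return f"-{spacer}-".join(residues)


def enrichment(content: str) -> float:
    """Turn a content string '(M/N)' into the fraction M / N."""
    m, n = map(int, re.fullmatch(r"\((\d+)/(\d+)\)", content).groups())
    return m / n


print(expand_zipper("L4"))     # L-X6-L-X6-L-X6-L
print(expand_zipper("L2EL2"))  # L-X6-L-X6-E-X6-L-X6-L
```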

Positional features are visualized between the "Feature table" and the "Structural features". Individual features can be identified by the colour code given in the feature table.

The field "Structural features" gives information about global structural features of the factor.

Data may be referenced; the source of the information is indicated by a bracketed number that points to the corresponding paper at the end of the entry.

"Cell specificity (positive)" gives the predominant occurrence of a factor in certain cell types or tissues. Occasionally, cells from which the factor has been isolated are indicated in brackets; this information does not necessarily point to a true cell specificity. Additionally, "Cell specificity (negative)" lists cells / tissues which have been proven not to express the corresponding factor. For human and mouse factors, the "Cell specificity" fields are gradually being replaced by the better structured "Expression pattern" field.

Interactions: Since most transcription factors bind to DNA as dimers, the dimerization partners are indicated in this field. If the factor's own name is repeated in this field, the factor forms homodimers. Inhibitory protein-protein interactions such as NF-kappaB - IkappaB are also given.
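
The homodimer convention can be read off programmatically. The following is a minimal Python sketch, assuming the interacting factors of an entry are already available as a plain list of names; this representation and the function name are hypothetical.

```python
from typing import Dict, List, Union


def dimer_partners(factor: str, interacting: List[str]) -> Dict[str, Union[bool, List[str]]]:
    """Split an interaction list into the homodimer flag and the other partners."""
    return {
        # The factor's own name in the list signals homodimer formation.
        "homodimer": factor in interacting,
        "partners": [name for name in interacting if name != factor],
    }


# Illustrative input only; not copied from a database entry.
print(dimer_partners("c-Jun", ["c-Jun", "c-Fos"]))
```

Note that the partner list may also contain inhibitors, so an entry in it does not by itself imply a DNA-binding dimer.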

The field "Matrices" gives the accession number and identifier of the linked MATRIX table entries.

"External database links" points to corresponding entries within the EMBL, SwissProt, PIR, FlyBase, PDB, RSNP, TRANSCompel®, PathoDB®, SMARt DB™ or TRANSPATH® data libraries (see Statistics).
