The high mobility group (HMG) superfamily of non-histone chromosomal proteins encompasses the nuclear proteins that can be extracted from nuclei or chromatin with 0.35 M NaCl and that have a molecular mass below 30 kDa. They were first found and characterized by Goodwin and Johns in mammalian cells and were so named for their high electrophoretic mobility in acid-urea gels. Another classical characteristic of the HMG family is its solubility in 2-5% trichloroacetic acid. HMG proteins are the most abundant (ca. 10^5-10^6 copies of each protein per nucleus) and ubiquitous non-histone proteins found in the nuclei of higher eukaryotes. They are mostly found in the nucleus, associated with chromatin, but in some cases they have also been found in the cytoplasm. Members of the HMG family have a conserved 70-75 amino acid domain termed the HMG box. These boxes are thought to regulate DNA conformation through their interaction with the minor groove of DNA. The HMG proteins recognize unique DNA structures and have been implicated in diverse functions, including the determination of nucleosome structure and stability, transcription and/or replication, and V(D)J recombination.
Types of HMG Proteins
For higher eukaryotes, the HMG proteins are further divided into three subfamilies: HMGB1/2, HMG-14/-17, and HMG-I/-Y. The subfamilies are distinguished by the molecular mass of the proteins, their amino acid motifs, and their DNA-binding characteristics.
HMGB1/2 is the largest of the three subfamilies. Its members have a molecular mass of approximately 25 kDa and are found in great abundance in eukaryotic nuclei, although they are non-histone proteins. HMGB-1/-2 can bind both single- and double-stranded DNA but show a preference for the former. These proteins can distinguish between different single-stranded conformations and show a preference for cruciform DNA. Experimental findings suggest that they play a role in chromosomal replication: antibodies to HMGB-1/-2 proteins inhibit DNA synthesis, and the proteins are reported to stimulate DNA polymerases.
HMGB1 itself is present at ca. 10^5-10^6 copies per nucleus. Further studies have indicated that HMGB1/2 proteins are present in the cytoplasm as well. Other research demonstrates that the amount and localization of the protein vary among tissues. The location of HMGB1 may also depend on the stage of the cell cycle. An inverse correlation exists between the levels of HMGB1/2 and the level of histone H1. HMGB1 is found in the cytoplasm when there is little cell growth; when cells are actively dividing, it is found in the nucleus.
The HMG-14/-17 proteins have a molecular mass of 10-12 kDa. They are the only non-histone proteins known to have a higher affinity for nucleosomes than for DNA.
The third group is formed by HMG-I and its isoform HMG-Y. These two proteins are similar to HMG-14/-17 in electrophoretic mobility and amino acid composition. An average mammalian cell with a 3 x 10^9 bp genome contains 10^6 molecules of HMGB-1/-2, 10^5 molecules of HMG-14/-17, and 10^4 molecules of HMG-I/-Y, while core histone molecules are present at ~2 x 10^7.
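To put these abundances on a common scale, the per-cell figures quoted above (a 3 x 10^9 bp genome; 10^6, 10^5, and 10^4 molecules of the three HMG groups; ~2 x 10^7 core histone molecules) can be converted into an average number of base pairs per molecule. The following Python sketch is only illustrative arithmetic. The names `GENOME_BP`, `COPIES_PER_CELL`, `bp_per_molecule`, and `spacing_table` are introduced here and do not come from the source, and the calculation assumes, purely for the sake of averaging, that molecules are spread evenly over the genome.

```python
# Per-cell figures quoted in the text; treated here as exact for illustration.
GENOME_BP = 3 * 10**9

COPIES_PER_CELL = {
    "HMGB-1/-2": 10**6,
    "HMG-14/-17": 10**5,
    "HMG-I/-Y": 10**4,
    "core histones": 2 * 10**7,
}


def bp_per_molecule(copies: int, genome_bp: int = GENOME_BP) -> int:
    """Average genomic base pairs per protein molecule (integer division).

    Assumes an even spread over the genome, which is an averaging
    convenience and not a claim about real chromatin distribution.
    """
    if copies <= 0:
        raise ValueError("copies must be positive")
    return genome_bp // copies


def spacing_table() -> dict:
    """Map each protein group to its average bp-per-molecule value."""
    return {name: bp_per_molecule(n) for name, n in COPIES_PER_CELL.items()}


if __name__ == "__main__":
    for name, bp in spacing_table().items():
        print(f"{name}: one molecule per {bp} bp on average")
```

With these inputs the averages come out to one HMGB-1/-2 molecule per 3,000 bp, one HMG-14/-17 molecule per 30,000 bp, one HMG-I/-Y molecule per 300,000 bp, and one core histone molecule per 150 bp.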
In 2001, Bustin reviewed and revised the nomenclature for HMG proteins. Under the revised scheme, the HMG proteins consist of three types: HMGB, HMGN, and HMGA. Each type has a characteristic functional sequence motif: the HMG-box for the HMGB type, the nucleosomal binding domain for the HMGN type, and the AT-hook for the HMGA type.
Proteins that contain any of these functional motifs in their sequences are known as HMG-motif proteins. Accordingly, HMG-1/-2 is now known as HMGB-1/-2, HMG-14/-17 as HMGN-1/-2, and HMG-I/-Y as HMGA-1/-2.
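The correspondence between the former subfamily names, the revised family names, and their characteristic motifs can be summarized as a small lookup table. The Python sketch below encodes only the family-level pairings stated in the text. The names `HMG_FAMILIES`, `family_for_former_name`, and `motif_of` are hypothetical helpers introduced for illustration, and the whitespace handling is an assumption about how a caller might type the old names.

```python
# Family-level nomenclature as described in the text (revised scheme of 2001).
HMG_FAMILIES = {
    "HMGB": {"former_name": "HMG-1/-2", "motif": "HMG-box"},
    "HMGN": {"former_name": "HMG-14/-17", "motif": "nucleosomal binding domain"},
    "HMGA": {"former_name": "HMG-I/-Y", "motif": "AT-hook"},
}


def _normalize(name: str) -> str:
    """Drop all whitespace and upper-case, so 'hmg- 1/-2' matches 'HMG-1/-2'."""
    return "".join(name.split()).upper()


def family_for_former_name(old_name: str) -> str:
    """Return the revised family name for a former subfamily name."""
    wanted = _normalize(old_name)
    for family, info in HMG_FAMILIES.items():
        if _normalize(info["former_name"]) == wanted:
            return family
    raise KeyError(f"unknown former HMG subfamily name: {old_name!r}")


def motif_of(family: str) -> str:
    """Return the characteristic functional motif of a revised family."""
    return HMG_FAMILIES[_normalize(family)]["motif"]


if __name__ == "__main__":
    for old in ("HMG-1/-2", "HMG-14/-17", "HMG-I/-Y"):
        new = family_for_former_name(old)
        print(f"{old} -> {new} (motif: {motif_of(new)})")
```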
Characterization and Structure of HMGB-1/-2
Sequence comparison studies have demonstrated approximately 80% sequence conservation in HMGB-1/-2 proteins among a wide variety of eukaryotic species (Chinese hamster, pig, cow, human). Bovine HMGB-1 and rat HMGB-1 share 99% and 97% sequence identity, respectively, with human HMGB-1. This sequence conservation among species indicates that HMGB-1 may have an essential function in the cell. Additionally, some of the HMGB1 proteins from Drosophila and Chironomus cells undergo post-translational modification in which the serine residues located in their acidic tails are phosphorylated. Phosphorylation appears to contribute to both the conformational and the metabolic stability of the protein.
HMGB1 interacts with DNA in a non-sequence-specific manner, yet it can aid sequence-specific DNA reactions in vitro. More recent research has indicated that HMGB1 has rudimentary sequence recognition capabilities which, when coupled with its cooperative interaction with the ZEBRA activator, lead to binding specificity. Further evidence for rudimentary sequence recognition comes from NMR structural studies of non-histone protein 6A (NHP6A), a Saccharomyces cerevisiae homologue of HMGB1. Methionine and phenylalanine residues of NHP6A intercalate into the DNA, conferring rudimentary DNA-binding specificity.
Both HMGB1 and HMGB2 have a tripartite structure. The first two regions contain a non-identical internal repeat forming the two HMG boxes, called box A (residues 1-79) and box B (residues 90-163). These two domains are conserved from trout to humans, have a globular structure, and can be isolated by controlled proteolytic digestion. Each of these domains is rich in charged amino acids, with a net positive charge of +20. Residues 80-89 form a link between the two boxes. The boxes are able not only to bind DNA but also to fold and/or bend it. The third domain of HMGB1, the acidic carboxy-terminal tail, lacks an identifiable ordered structure. The first 20 residues have a net charge of +8, with lysine residues accounting for the positive charges. The last 29 residues form a continuous stretch of acidic amino acids, which is able to interact with histone H1 in vitro. A block schematic detailing these regions in HMGB1 is shown below.
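Independently of the schematic, the residue ranges quoted above can also be written as a small lookup table. The following Python sketch is an illustration only and is not taken from the source. The names `Region`, `HMGB1_REGIONS`, and `region_of` are hypothetical. The start of the acidic tail at residue 164 is an assumption (the residue immediately after box B), and the tail is left open-ended because the text does not state the total length of the protein.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int            # first residue, 1-based, inclusive
    end: Optional[int]    # last residue, inclusive; None means "to the C terminus"


# Box A, linker, and box B boundaries are those given in the text.
HMGB1_REGIONS = [
    Region("box A", 1, 79),
    Region("linker", 80, 89),
    Region("box B", 90, 163),
    # Assumption: the tail begins right after box B; its end is not specified.
    Region("acidic C-terminal tail", 164, None),
]


def region_of(residue: int) -> str:
    """Return the name of the HMGB1 region that contains a 1-based residue number."""
    if residue < 1:
        raise ValueError("residue numbers are 1-based and must be >= 1")
    for region in HMGB1_REGIONS:
        if residue >= region.start and (region.end is None or residue <= region.end):
            return region.name
    # Not reachable while the last region is open-ended; kept as a safeguard.
    raise ValueError(f"residue {residue} is outside the annotated regions")


if __name__ == "__main__":
    for r in (1, 79, 80, 89, 90, 163, 164):
        print(r, "->", region_of(r))
```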
HMGB-2 shows 86% identity with HMGB-1 in the A/B domains, but only 50% identity in the acidic C-terminal tail. There are 30 negatively charged residues in the C terminus of HMGB-1, whereas HMGB-2 has 22.
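Identity percentages such as those above are computed over an alignment of the two sequences. The Python sketch below shows one common way to do this, under stated assumptions: the sequences are already aligned to equal length, gaps are written as "-", and gapped columns are excluded from the denominator. Other conventions exist, and the source does not say which one was used. The function name `percent_identity` is hypothetical, and the example strings are made-up toy sequences, not HMGB sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences for illustration only; they differ at the last position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

For the toy pair, 9 of 10 compared positions match, so the function returns 90.0.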