AMYLOLYTIC ENZYMES - FOCUS ON THE ALPHA-AMYLASES FROM ARCHAEA AND PLANTS

Amylolytic enzymes represent a group of starch hydrolases and related enzymes that are active towards the α-glycosidic bonds in starch and related poly- and oligosaccharides. The three best-known amylolytic enzymes are α-amylase, β-amylase and glucoamylase, which nevertheless differ from each other in their amino acid sequences, three-dimensional structures, reaction mechanisms and catalytic machineries. In the sequence-based classification of all glycoside hydrolases (GHs) they have therefore been classified into three independent families: GH13 (α-amylases), GH14 (β-amylases) and GH15 (glucoamylases). Some amylolytic enzymes have been placed in the families GH31 and GH57. The family GH13 together with the families GH70 and GH77 constitutes the clan GH-H, well known as the α-amylase family. It contains more than 6,000 sequences and covers 30 different enzyme specificities sharing the conserved sequence regions, catalytic TIM-barrel fold, retaining reaction mechanism and catalytic triad. Among the GH13 α-amylases, those produced by plants and archaebacteria exhibit common sequence similarities that distinguish them from the α-amylases of the remaining taxonomic sources. Despite the close evolutionary relatedness between the plant and archaeal α-amylases, there are also specific differences that discriminate them from each other. These specific differences could be used in an effort to reveal the sequence-structural features responsible for the high thermostability of the α-amylases from Archaea.

The starch industry covers many well-developed as well as recently established sophisticated technologies that utilize amylolytic enzymes. These amylases represent approximately 30% of the worldwide industrial enzyme production, starch hydrolysis being considered their main use (VAN DER MAAREL et al., 2002).

Amylolytic enzymes
Owing to the complex structure of starch and related oligo- and polysaccharides, starch-degrading organisms have to possess a relevant combination of starch hydrolases and related enzymes (LEGIN et al., 1998; BERTOLDO and ANTRANIKIAN, 2002). These enzymes are in general called amylases.
The amylolytic enzymes form a large group of starch hydrolases and related enzymes that are active towards starch, pullulan, glycogen and other related oligo- and polysaccharides (VIHINEN and MANTSALA, 1989; PANDEY et al., 2000; JANECEK, 2009). What is responsible for the activity of amylolytic enzymes is a common way of binding a glucose residue of the substrate in the enzyme active centre, at what is conventionally termed a substrate-binding subsite (DAVIES et al., 1997). Most of them belong to the glycoside hydrolases (GHs), which are divided into individual GH families that share no mutual sequence similarities (HENRISSAT, 1991). The GH families are now part of the CAZy web-server (CANTAREL et al., 2009), which also covers other carbohydrate-active enzymes (Fig. 1). The best-known amylolytic enzymes are α-amylase (EC 3.2.1.1), β-amylase (EC 3.2.1.2) and glucoamylase (EC 3.2.1.3), which are, however, quite different from each other. They differ not only in their primary and tertiary structures, but also in their catalytic machineries and the reaction mechanisms employed (JANECEK, 1994a; PUJADAS et al., 1996; COUTINHO and REILLY, 1997). They have therefore been classified into different GH families: GH13 (α-amylases), GH14 (β-amylases) and GH15 (glucoamylases) (HENRISSAT, 1991).
The enzymatic hydrolysis of a glycosidic bond can be characterized by a general acid catalysis that requires two essential components: a proton donor (an acid) and a nucleophile (a base). According to the anomeric configuration of the resulting hydroxyl group with regard to the configuration of the cleaved O-glycosidic linkage, two basic mechanisms exist for this hydrolysis (Fig. 2): retaining or inverting (MCCARTER and WITHERS, 1994). Whereas α-amylase employs the retaining mechanism (i.e. the products of its action have the α-anomeric configuration), both β-amylase and glucoamylase are inverting hydrolases (i.e. their products have the β-anomeric configuration).
From the structural point of view (Fig. 3), both α-amylase and β-amylase rank among the TIM-barrel enzymes, i.e. they possess the (β/α)8-barrel catalytic domain, while glucoamylase adopts a helical version of the catalytic TIM-barrel, the so-called (α/α)6-barrel. Within the CAZy classification the α-amylases from the family GH13 with the closely related families GH70 and GH77 constitute the clan GH-H that is well known as the α-amylase family (MACGREGOR et al., 2001; CANTAREL et al., 2009). It is worth mentioning that some α-amylases with sequences and structures different from the main GH13 α-amylases have been placed in the family GH57 (JANECEK, 2005) and some amylolytic enzymes are also present in the family GH31 (NAKAI et al., 2005; KANG et al., 2008).
It can thus be summarised that amylases and related enzymes classified into the families GH13 (forming with GH70 and GH77 the clan GH-H), GH14, GH15 as well as GH31 and GH57 differ from each other in their amino acid sequences, three-dimensional structures, catalytic machineries and reaction mechanisms (JANECEK, 2009).
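The distinctions drawn above between the three best-known amylolytic enzymes can be collected in a small lookup table. The following Python fragment is only an illustrative sketch: the names `AmylaseClass`, `MAIN_AMYLASES` and `share_mechanism` are hypothetical, and the data are restricted to what is stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmylaseClass:
    """Properties of one amylolytic enzyme as summarised in the text."""
    ec_number: str
    gh_family: str
    mechanism: str        # "retaining" or "inverting"
    catalytic_fold: str


# Hypothetical lookup table; values are taken from the text only.
MAIN_AMYLASES = {
    "alpha-amylase": AmylaseClass("3.2.1.1", "GH13", "retaining", "(beta/alpha)8-barrel"),
    "beta-amylase": AmylaseClass("3.2.1.2", "GH14", "inverting", "(beta/alpha)8-barrel"),
    "glucoamylase": AmylaseClass("3.2.1.3", "GH15", "inverting", "(alpha/alpha)6-barrel"),
}


def share_mechanism(first: str, second: str) -> bool:
    """Return True if the two enzymes use the same reaction mechanism."""
    return MAIN_AMYLASES[first].mechanism == MAIN_AMYLASES[second].mechanism


for name, props in MAIN_AMYLASES.items():
    print(f"{name}: {props.gh_family}, {props.mechanism}, {props.catalytic_fold}")
```

A table of this kind makes it explicit that sharing a fold (α-amylase and β-amylase) does not imply sharing a mechanism, and vice versa (β-amylase and glucoamylase).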

α-Amylase enzyme family
Most amylolytic enzymes are grouped in the α-amylase family (MACGREGOR et al., 2001). It was originally recognised as a group of starch hydrolases and related enzymes (such as α-amylase, cyclodextrin glucanotransferase, neopullulanase, etc.) that exhibited sequence similarities and a commonly predicted TIM-barrel fold (MACGREGOR and SVENSSON, 1989; TAKATA et al., 1992). Within the sequence-based classification of GHs, it was originally established as the family GH13 (HENRISSAT, 1991), but later the families GH70 and GH77 were added to form the presently well-accepted clan GH-H (MACGREGOR, 2005; JANECEK, 2009).

Clan GH-H
The above-mentioned families GH13, GH70 and GH77 form the clan GH-H, i.e. the α-amylase family, which at present comprises 30 different enzyme specificities (Table 1) and contains more than 6,000 sequences (CANTAREL et al., 2009). The members of the α-amylase family are not only hydrolases, but also transferases and isomerases. Based on amino acid sequence similarities, even some heteromeric amino acid transporter proteins may be considered non-amylolytic members of the clan GH-H (JANECEK et al., 1997) (Fig. 4).
Enzymes that are members of the α-amylase family have to obey the following four criteria (KURIKI and IMANAKA, 1999; MACGREGOR et al., 2001; JANECEK, 2002; VAN DER MAAREL et al., 2002): (i) they act on α-glucosidic bonds (not only the α-1,4- and α-1,6-linkages); (ii) they employ the retaining reaction mechanism; (iii) they contain from 4 up to 7 conserved sequence regions; and (iv) they possess the same catalytic machinery within the catalytic TIM-barrel fold, consisting of the aspartate residue near the end of the strand β4 (catalytic nucleophile), the glutamate residue near the end of the strand β5 (proton donor) and the aspartate residue near the end of the strand β7 (transition-state stabiliser).
The conserved sequence regions (Fig. 4) represent short stretches of amino acid sequence that can be found in every α-amylase family member in equivalent positions and that contain the catalytic triad (Asp206, Glu230 and Asp297; Aspergillus oryzae α-amylase numbering; MATSUURA et al., 1984) and other functionally important residues (NAKAJIMA et al., 1986; JANECEK, 2002). These conserved sequence regions, common to the entire clan GH-H, may also be used as sequence "fingerprints" since they contain amino acid residues exclusively specific for the individual enzyme specificities (JANECEK, 2008).
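The four criteria listed above can be read as a simple conjunction of tests. The sketch below, written in Python, is only a hedged illustration of that logic: the record type `CandidateEnzyme`, its field names and the function `meets_family_criteria` are hypothetical and are not taken from any database or tool.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateEnzyme:
    """Hypothetical description of an enzyme to be tested against the criteria."""
    acts_on_alpha_glucosidic_bonds: bool
    mechanism: str                      # "retaining" or "inverting"
    conserved_regions: int              # number of conserved sequence regions found
    # One-letter residue found near the end of each catalytic strand.
    catalytic_residues: dict = field(default_factory=dict)


# Criterion (iv): Asp on strand beta4, Glu on strand beta5, Asp on strand beta7.
EXPECTED_TRIAD = {"beta4": "D", "beta5": "E", "beta7": "D"}


def meets_family_criteria(enzyme: CandidateEnzyme) -> bool:
    """Return True only if all four criteria from the text are satisfied."""
    has_triad = all(
        enzyme.catalytic_residues.get(strand) == residue
        for strand, residue in EXPECTED_TRIAD.items()
    )
    return (
        enzyme.acts_on_alpha_glucosidic_bonds          # criterion (i)
        and enzyme.mechanism == "retaining"            # criterion (ii)
        and 4 <= enzyme.conserved_regions <= 7         # criterion (iii)
        and has_triad                                  # criterion (iv)
    )


# Illustrative input only, not a curated annotation of a real sequence.
example = CandidateEnzyme(True, "retaining", 7, {"beta4": "D", "beta5": "E", "beta7": "D"})
print(meets_family_criteria(example))
```

In practice each field would have to be derived from experimental data or from a sequence alignment; the sketch only shows that failing any single criterion excludes an enzyme from the family.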
The α-amylase family members are multidomain proteins (Fig. 3a) containing the main catalytic domain in the form of a parallel (β/α)8-barrel (domain A) that is interrupted by a usually small domain in the place of the loop 3 connecting the strand β3 with the helix α3 (domain B) and succeeded by the antiparallel β-sandwich domain (domain C). The α-amylase type of barrel has been confirmed in all members of the α-amylase family whose three-dimensional structure has already been determined (Fig. 4). The (β/α)8-barrel of α-amylases was first revealed in the structure of Taka-amylase A (MATSUURA et al., 1984), i.e. in the structure of the α-amylase from Aspergillus oryzae. Since this type of fold was first identified in triose-phosphate isomerase (TIM), the (β/α)8-barrel is often simply called the TIM-barrel (FARBER and PETSKO, 1990). It is a barrel of eight inner parallel β-strands surrounded on the outside by eight α-helices (Fig. 3a,b). The active site of these enzymes is localised at the C-terminal end of the TIM-barrel (MATSUURA et al., 1984; QIAN et al., 1993; KADZIOLA et al., 1994; LINDEN et al., 2003). Comparison of known tertiary structures of various α-amylase family members with sequence alignments has shown that differences in specificity result from variations in substrate binding at the β→α loops (SVENSSON, 1994; JANECEK, 1997). Also the active-site cleft is not of the same shape in each case (KAMITORI et al., 1999; PRZYLAS et al., 2000), despite the fact that it always contains the same catalytic triad, accompanied, however, by several additional residues depending on the given enzyme specificity (MATSUURA, 2002). Differences, especially in length, sequence and secondary structure, have also been seen within the domain B protruding out of the catalytic TIM-barrel in the place of the loop 3 (JESPERSEN et al., 1991, 1993). It was pointed out that these differences may be directly related to enzyme specificity (JANECEK et al., 1997). With regard to domain C succeeding the catalytic TIM-barrel, this domain could contribute to the overall stability of the catalytic domain by shielding the hydrophobic residues of the barrel (KATSUYA et al., 1998).
As far as the conserved sequence regions of the α-amylase family are concerned (Fig. 4), four of them (the regions I, II, III and IV) belong to the best-known regions established more than 20 years ago, whereas the three additional ones (the regions V, VI and VII) were identified more recently. The former regions (FRIEDBERG, 1983; NAKAJIMA et al., 1986; MACGREGOR et al., 2001), positioned near the C-termini of the β-strands β3, β4, β5 and β7 of the catalytic TIM-barrel, contain most of the functionally important residues including the catalytic triad (Fig. 4). The latter regions (JANECEK, 1992, 1994a,b, 1995, 2002), located near the C-terminal end of domain B and of the β-strands β2 and β8, cover the features distinguishing the individual enzyme specificities from each other. Even the absence of the fifth conserved sequence region, for example, may be used as a feature characteristic of a given specificity (JANECEK, 2000).
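Because the regions were numbered in the order of their discovery rather than in the order of their occurrence in the polypeptide chain, it may help to sort them by structural location. The Python sketch below assumes the assignment given in this text (regions I to IV on the strands β3, β4, β5 and β7; region V in domain B, i.e. the loop 3 following the strand β3; regions VI and VII on the strands β2 and β8); the names `REGION_LOCATION`, `CHAIN_ORDER` and `regions_in_chain_order` are hypothetical.

```python
# Structural location of each conserved sequence region, as given in the text.
REGION_LOCATION = {
    "I": "beta3",
    "II": "beta4",
    "III": "beta5",
    "IV": "beta7",
    "V": "domain B",
    "VI": "beta2",
    "VII": "beta8",
}

# Order of the relevant elements along the chain; domain B replaces loop 3,
# which connects strand beta3 with helix alpha3, so it follows beta3.
CHAIN_ORDER = [
    "beta1", "beta2", "beta3", "domain B",
    "beta4", "beta5", "beta6", "beta7", "beta8",
]


def regions_in_chain_order() -> list:
    """Return the region labels sorted from the N- to the C-terminal end."""
    return sorted(
        REGION_LOCATION,
        key=lambda region: CHAIN_ORDER.index(REGION_LOCATION[region]),
    )


print(" - ".join(regions_in_chain_order()))
```

Under these assumptions the regions appear along the sequence in the order VI, I, V, II, III, IV, VII.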
Although the basic arrangement of the α-amylase family members is the same, comprising the three domains A, B and C (Fig. 3a), it should be taken into account that some family members contain additional C- and/or N-terminal domains, for example cyclodextrin glucanotransferase (KLEIN and SCHULZ, 1991) and neopullulanase (HONDOH et al., 2003). These domains may play various, still not completely understood roles, but most of them have been anticipated to be involved in binding starch (glycogen, pullulan) and related substrate analogues. These non-catalytic domains were in many cases confirmed to have this property and thus have been called starch-binding domains (PENNINGA et al., 1995; SORIMACHI et al., 1997). It was found that the starch-binding domain disrupts the starch surface and thus enhances the amylolytic hydrolysis (SOUTHALL et al., 1999). Within the CAZy server (Fig. 1), these motifs have been classified into the CBM (carbohydrate-binding module) families (CANTAREL et al., 2009). At present, nine families of starch-binding domains are known: CBM20, CBM21, CBM25, CBM26, CBM34, CBM41, CBM45, CBM48 and CBM53. The motifs from the family CBM20 belong to the most intensively studied starch-binding domains (SVENSSON et al., 1989; JANECEK and SEVCIK, 1999; RODRIGUEZ-SANOJA et al., 2005; MACHOVIC and JANECEK, 2006a). Based on a detailed bioinformatics analysis it was suggested to establish a common CBM clan from the families CBM20 and CBM21 (MACHOVIC et al., 2005), and the motifs classified recently into the families CBM48 and CBM53 could also join the proposed CBM clan (MACHOVIC and JANECEK, 2006b, 2008).

Glycoside hydrolase families GH70 and GH77
The family GH70 contains the sucrose-utilising glucosyltransferases (glucansucrase and alternansucrase) that possess a circularly permuted version of the α-amylase-type catalytic TIM-barrel (MACGREGOR et al., 1996). The first element of the GH70-type barrel is the α-helix equivalent to helix α3 of the α-amylase-type TIM-barrel, whereas the last element is the β-strand equivalent to strand β3 of α-amylases (Fig. 5). This means that instead of E1-H1-E2-H2…E8-H8 present in α-amylases (and overall in both the families GH13 and GH77), in GH70 glucosyltransferases there is H3-E4-H4-E5…H2-E3, where E and H stand for β-strand and α-helix, respectively (MACGREGOR et al., 1996). The glucansucrases are usually large multidomain proteins occurring exclusively in lactic acid bacteria (VAN HIJUM et al., 2006). A three-dimensional structure of a GH70 enzyme has since been solved ([…], 2008), which has confirmed the previous predictions concerning the circular permutation (Fig. 5). The solved structure interestingly revealed that the enzyme adopts the so-called "U-fold" domain arrangement, so that 4 of the 5 domains are formed by combining an N- and a C-terminal part of the polypeptide chain (DIJKSTRA et al., 2007).
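The circular permutation described above amounts to rotating the ordered list of barrel elements so that it starts with the helix H3. The following Python sketch illustrates only this bookkeeping, using the E/H notation of the text; the function names `gh13_barrel_elements` and `circularly_permute` are hypothetical.

```python
def gh13_barrel_elements(count: int = 8) -> list:
    """Return the ordinary TIM-barrel element order E1, H1, ..., E8, H8."""
    elements = []
    for index in range(1, count + 1):
        elements.append(f"E{index}")   # beta-strand
        elements.append(f"H{index}")   # alpha-helix
    return elements


def circularly_permute(elements: list, first: str) -> list:
    """Rotate the element list so that it begins with the element `first`."""
    start = elements.index(first)
    return elements[start:] + elements[:start]


gh13 = gh13_barrel_elements()
gh70 = circularly_permute(gh13, "H3")

print("GH13/GH77:", "-".join(gh13))
print("GH70:     ", "-".join(gh70))

# Helix order seen from the N-terminal end of the GH70 barrel.
helix_order = "".join(name[1:] for name in gh70 if name.startswith("H"))
print("GH70 helix order:", helix_order)
```

The rotated list starts with H3 and ends with H2-E3, and its helices appear in the order 34567812, in agreement with the arrangement shown in Fig. 5.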
The interest in the family GH77 was recently increased by the discovery of putative amylomaltases from a few borreliae that exhibited non-GH77 features in their amino acid sequences (GODANY et al., 2008). It was especially the arginine positioned two residues before the catalytic nucleophile in the conserved sequence region II (Fig. 4) that was recognized to be replaced naturally by a lysine in the GH77 amylomaltase-like protein from Borrelia burgdorferi (MACHOVIC and JANECEK, 2003). This arginine was otherwise considered to belong to the four residues conserved invariantly throughout the α-amylase family, i.e. the entire clan GH-H (JANECEK, 2002). The exclusive (i.e. non-GH77) sequence features present in GH77-like proteins from borreliae have already been confirmed, and it was also determined that the B. burgdorferi GH77 amylomaltase-like protein exhibits a typical amylomaltase activity, i.e. the enzyme catalyzes both the hydrolysis of maltooligosaccharides and the formation of their transglycosylation products (GODANY et al., 2008). Based on the bioinformatics analysis of various real and hypothetical GH77 amylomaltases, some of the borrelial GH77-like proteins were suggested to exhibit an intermediary character within this family (JANECEK, 2008).

Glycoside hydrolase families GH31 and GH57
The families GH31 and GH57 are not members of the clan GH-H, i.e. they do not belong to the α-amylase family as it is widely accepted (MACGREGOR et al., 2001), but they both deserve some attention here since they contain similar enzyme specificities (α-amylase, α-glucosidase, amylopullulanase, 4-α-glucanotransferase, branching enzyme, etc.).
As far as the family GH57 is concerned, it contains several enzyme specificities that are also members of the main α-amylase family, only the α-galactosidase (EC 3.2.1.22) being different (JANECEK, 2005; MURAKAMI et al., 2006). It also employs the retaining mechanism, but owing to a different catalytic domain (an incomplete version of a TIM-barrel, i.e. a (β/α)7-barrel; Fig. 7b) and a different catalytic machinery (IMAMURA et al., 2003; DICKMANNS et al., 2006), it should be evolutionarily more distantly related to GH13 than is the family GH31 (JANECEK, 1998). Moreover, GH57 exhibits its own conserved sequence regions (ZONA et al., 2004) that are different from those characteristic of the clan GH-H (JANECEK, 2002).

α-Amylases from archaebacteria and plants
At present it is well known and accepted that plant and archaeal α-amylases from the family GH13 are similar in sequence and evolutionarily related. This remarkable relationship was first observed ten years ago (JANECEK et al., 1999; JONES et al., 1999). Before the first GH13 α-amylases from Archaea became available, the plant α-amylases were positioned in the evolutionary tree (Fig. 8) on a branch next to the cluster of bacterial liquefying and intracellular α-amylases represented by bacilli and enterobacteria, respectively (JANECEK, 1994b).

Similarities and differences
The first detailed bioinformatics study focused on the archaeal α-amylases and their counterparts from a wide spectrum of the remaining living organisms from Bacteria and Eucarya revealed (JANECEK et al., 1999) that the sequence features exclusive to the α-amylases from hyperthermophilic archaea are also present, and almost only present, in the plant α-amylases (Fig. 9). These features are as follows (JANECEK et al., 1999): (i) Ile107 (Thermococcus hydrothermalis α-amylase numbering; LEVEQUE et al., 2000a) succeeding the conserved aspartate in the conserved sequence region I (strand β3); (ii) (Ala194)-Trp195 at the beginning, Tyr199 in the middle and Gly202 at the end of the region II (strand β4); (iii) Ala219 succeeding the conserved tryptophan and Tyr223-Trp224 succeeding the catalytic proton donor (Glu222) in the region III (strand β5); (iv) Ala286 in the region IV (strand β7); (v) Ile196 in the region V (located within the loop 3, i.e. domain B); (vi) Ile42 succeeding the conserved glycine at the beginning and the dipeptide Pro48-Pro49 at the end of the region VI (strand β2); and (vii) Gln309 succeeding the conserved glycine at the beginning, the tripeptide Ile312-Phe313-Tyr314 in the middle and Asp316 at the end of the region VII (strand β8). It is worth mentioning that some of the above-mentioned residues have already been recognised as functionally important (KADZIOLA et al., 1998; LINDEN et al., 2003). Thus, for example, the glycine from the region II (Gly202 of the archaeal α-amylase) serves as a specific ligand for the calcium ion, and the tryptophan from the region III (Trp224 of the archaeal α-amylase) forms a stacking interaction with one of the acarbose rings bound in the active site in the structure of the complex of barley α-amylase with acarbose (KADZIOLA et al., 1998). These residues should play the same roles in the structure of the archaeal α-amylase from Pyrococcus woesei (LINDEN et al., 2003).
The close sequence similarity between the α-amylases from Archaea and plants has raised the possibility of revealing the factors responsible for the high thermostability of the archaeal α-amylases, which exhibit temperature optima around and above 80 °C (LEVEQUE et al., 2000b; BERTOLDO and ANTRANIKIAN, 2002). The plant enzymes are generally substantially less thermostable. It is worth mentioning that on the one hand the archaeal and plant α-amylases contain common sequence features that discriminate them from the remaining sources, but on the other hand they have to possess additional sequence features that should enable one to distinguish them from each other, e.g. the alanine from the region IV (Ala286 of the archaeal α-amylase) that has no correspondence in the plant counterparts (Fig. 4). Such specific differences could be utilized in an effort to identify the molecular basis of the high thermostability of the archaeal α-amylases via the approaches of site-directed mutagenesis and protein design.

Evolutionary relatedness
The close evolutionary relatedness of the α-amylases from Archaea and plants from the family GH13 is shown in Figure 10. The family GH13, as one of the largest GH families (CANTAREL et al., 2009), has recently been divided into subfamilies (STAM et al., 2006), the plant and archaeal α-amylases being placed into the subfamilies GH13_6 and GH13_7, respectively. The α-amylases most closely related to those from plants and Archaea (Fig. 10) are the bacterial enzymes from Bacillus licheniformis (YUUKI et al., 1985) and Escherichia coli (RAHA et al., 1992), which represent the liquefying and intracellular α-amylases, respectively, as observed originally (JANECEK, 1994b). It should be noted, however, that the close evolutionary relationship between the α-amylases from Archaea and plants, illustrated here only for a limited sample of living organisms (Fig. 10), has also been confirmed in more recent evolutionary trees comparing a wider spectrum of taxonomic sources, including novel groups of α-amylases from bacteria (DA LAGE et al., 2004) and fungi (VAN DER KAAIJ et al., 2007).

Fig. 2. (a) Retaining reaction mechanism of glycoside hydrolases (MACGREGOR et al., 2001). The proton donor protonates the glycosidic oxygen and the catalytic nucleophile attacks at C1, leading to formation of the first transition state. The catalytic base promotes the attack of the incoming molecule ROH (water in hydrolysis or another sugar molecule in transglycosylation) on the covalent intermediate, resulting in a second transition state and leading to the hydrolysis or transglycosylation product. (b) Inverting reaction mechanism of glycoside hydrolases (SAUER et al., 2000). The catalytic base (top) and acid (bottom) in the water-assisted hydrolysis of substrate leading to inversion of the configuration of the anomeric carbon.

Fig. 4. Sequence fingerprints of the α-amylase family members. One representative of each enzyme specificity is presented. The catalytic triad is highlighted in yellow and signified by asterisks. The other functionally important residues corresponding with His122, Arg204 and His296 of α-amylase are also coloured. The well-conserved aspartate (beginning of the strand β3) is signified by black-and-white inversion. The residues conserved in at least 50% of sequences are shown on a grey background. The representatives of heteromeric amino acid transporter proteins (HATs) are also shown. The 'Year' denotes the year of three-dimensional structure determination (if any). Adapted from JANECEK (2002).

Fig. 5. The arrangement of the secondary structure elements in GH70 with respect to the GH13 α-amylase-type TIM-barrel. (a) Typical "ordinary" TIM-barrel present in the members of the family GH13 (and also GH77); (b) circularly permuted version of the family GH70. The helices are represented by black rectangles and the strands are shown as arrows. The order of the helices in GH13 (and GH77) is 12345678 from the N-terminal end of the protein, whereas in GH70 the order is 34567812. Adapted from MACGREGOR (2005).

Fig. 9. Sequence fingerprints of α-amylases. The enzymes represent the individual taxonomic sources with focus on Archaea and plants. The sequence features characteristic of the archaeal α-amylases are highlighted in black-and-white inversion. The catalytic triad is signified by asterisks and yellow highlighting. Adapted from JANECEK (2008).

Table 1. The members of the α-amylase family (clan GH-H).
a HATs means the heteromeric amino acid transporter proteins. Adapted from JANECEK (2009).