IMGT genome database (classifies the immunoglobulin (IG) and T cell receptor (TR) genes from vertebrate species)

The amino-terminal regions of H and L chains are characterized by extensive variability in amino acid sequence. These parts of the molecule are referred to as variable (V) regions and are responsible for recognition of the antigen. By contrast, the carboxy-terminal part of the immunoglobulin, referred to as the constant (C) region, is less variable and differs only between distinct immunoglobulin classes and subclasses. Adaptive immunity relies on sophisticated genetic mechanisms to create the diversity of antibody V regions during the lifespan of a healthy human. Two distinct processes contribute to the genetic diversification of V regions: V (variable), D (diversity) and J (joining) recombination, and somatic hypermutation. Theoretically, under physiological conditions, the human immune system can generate BCRs with 10^26 distinct sequences, an astronomical number that is far greater than the calculated number of all B cell clones that can be generated (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

Antibodies bind their targets using diversified loops, termed complementarity-determining regions (CDRs), with three in each rearranged VH and VL gene. CDRs 1 and 2 are encoded by germline V genes, while the CDR3s in both VH and VL are the product of gene recombination. (D’Angelo “Many Routes to an Antibody Heavy-Chain CDR3: Necessary, yet Insufficient, for Specific Binding” Frontiers in Immunology, March 2018).

The development of the antibody repertoire involves the recombination of individual members of two or three groups of germline segments, nucleotide insertion or deletion at these junctions, and somatic mutation of rearranged genes.

Antibody architecture accommodates a wealth of structural diversity. Heavy and light chain variable domains (VH and VL) each consist of a beta-sheet scaffold, surmounted by 3 antigen binding loops (CDRs) of different lengths which are fleshed out with a variety of different side chains. The structural diversity of the loops can create binding sites of a variety of shapes, ranging from almost flat surfaces to deep cavities (Tomlinson, J Mol Biol 1992, 227, 776-798). A diverse repertoire of V genes that encode the VL and VH domains is produced by the combinatorial rearrangement of gene segments that are drawn from pools of moderate size. (Tomlinson, EMBO Journal, 14(18), 4628-4638).

Spatially separated residues of H and L chains are involved in the formation of the antigen binding site. The structure of this site includes the hypervariable regions of the V domains, but not completely; the surface of the antigen binding site includes only about 23% of the surface of the CDRs. The antigen binding capacity of the V domains of separate H and especially L chains is much weaker than the capacity of the active center formed by their combination. Since immunoglobulin monomers have two pairs of H and L chains, two antigen binding sites are formed. Although the structures of V domains and binding regions are substantially similar, variation of the amino acids involved in the formation of the antigen binding cavity results in changes to the fine details of its configuration.

The process by which antibodies are formed is a complex one which involves a set of reactions whenever a foreign material, such as an antigen, is introduced to a functioning immune system. Mature B cells are triggered to produce antibodies via their interaction with antigen and helper T cells. These antibody molecules consist of a light and heavy chain, and are coded for by genes present in the mammalian genome. Every light chain is coded for by 3 distinct gene segments (the VL, JL and CL segments), while heavy chains are coded for by 4 segments (VH, DH, JH and CH).

The variable region of light chains is coded for by the VL and JL segments, whereas the variable region of heavy chains is coded for by the VH, DH and JH segments. A number of different genes exist for each segment. However, it is the shuffling and rearrangement of these genes which leads to the tremendous number and diversity of antibodies, an estimated 10^6 to 10^7 antibodies. In addition, at all three points at which the variable gene segments are joined (i.e., the VH-D, D-JH and VL-JL junctures), substantial sequence variability is possible for any given pair of assembling segments. This junctional variability, together with the combinatorial diversity, can lead to an estimated 10^12 different antibodies.

To add to the complexity and diversity of antibody generation, when an antibody response to a T cell dependent antigen is mounted, those B cells with antibodies capable of engaging the antigen proliferate to form large clones, and members of the B cell clones diversify their variable genes by a hypermutation mechanism. This process, called somatic mutation, results in point nucleotide substitutions in the assembled antibody genes expressed by clones of immune participating B lymphocytes. Often these nucleotide substitutions result in amino acid replacements in the encoded antibody variable regions.

Sequence Variability at a DNA/Gene Level

The adaptive immune system cells (B and T lymphocytes) are selectively activated by the specific recognition of an antigen via the variable region of their surface B cell receptors (BCRs) and T cell receptors (TCRs), respectively. These receptors undergo sequential mechanisms to maximize diversity; this enables a potentially specific response to a wide range of antigens. In the primary lymphoid organs, H and L chain genes (the H chain genes are located on chromosome 14, while kappa and lambda light chain genes are located on chromosomes 2 and 22, respectively) and TCR alpha and beta chain genes (located on chromosomes 14 and 7, respectively) encode the transmembrane antigen heterodimer receptors. These receptor genes undergo the first step in a complex mechanism of antigen receptor gene rearrangement, known as somatic V(D)J recombination. This highly regulated mechanism combines one Variable, one Diversity and one Joining gene segment to generate a unique single gene (V(D)J) that will encode a unique BCR or TCR variable region. In B cells, a further step of diversification can occur by somatic hypermutation of the V region to generate BCRs with high affinity antigen binding sites. (Monsuro, “Next generation sequencing: new tools in immunology and hematology” Blood Res 2013, 48: 242-9).

The human antibody germline repertoire has been completely sequenced. There are about 50 functional VH germline genes located on chromosome 14, which can be grouped into six subfamilies according to sequence homology. About 40 functional VL kappa genes comprising several subfamilies are located on chromosome 2, and about 30 functional VL lambda genes, also grouped into subfamilies, are found on chromosome 22. The groups vary in size from one member (e.g., VH6 and Vk4) to up to 22 members (VH3), and the members of each group share a high degree of sequence homology. Comparison of rearranged sequences of human antibodies with their germline counterparts shows that many human germline genes are never or only very rarely used during an immune response. (Knappik, J. Mol. Biol. 2000, 296, 57-86).

Sequence variability (generation of immune diversity) of the VH and VL chains is concentrated in several hypervariable regions which also form the antigen binding site of the antibody molecule. This antigen binding site is complementary to the structure of the epitope, and these regions are sometimes referred to as “complementarity-determining regions (CDRs).” The CDRs are interspersed with regions that are more conserved, termed framework regions (FRs), and, together with the CDRs from the other chain, contribute to the formation of the antigen binding site. Each light chain variable region (LCVR) and heavy chain variable region (HCVR) is composed of three CDRs and four FRs, arranged from amino terminus to carboxy terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4. The three CDRs of the light chain are referred to as LCDR1, LCDR2 and LCDR3, and the three CDRs of the heavy chain are referred to as HCDR1, HCDR2 and HCDR3. The CDRs contain most of the residues which form specific interactions with the antigen. The framework regions of virtually all antibodies can be superimposed on one another. Only the CDR loops show different orientations in high resolution X-ray crystallography.

Typically the VL will include the portion of the light chain encoded by the VL and JL (J or joining region) gene segments and the VH will include the portion of the heavy chain encoded by the VH and DH (D or diversity region) and JH gene segments. The base of the Y is sometimes referred to as Fc (Fragment, crystallizable) region and is composed of two heavy chains that contribute two or three constant domains depending on the class of the antibody.

The process for generating DNA encoding the heavy and light chain immunoglobulin sequences occurs primarily in developing B cells. Prior to the rearranging and joining of various immunoglobulin gene segments, the V, D, J and constant (C) gene segments are found generally in relatively close proximity on a single chromosome. During B cell differentiation, one of each of the appropriate family members of the V, D and J (or only V and J in the case of light chain genes) gene segments are recombined to form functionally rearranged variable regions of the heavy and light immunoglobulin genes. This gene segment rearrangement process appears to be sequential. First, heavy chain D to J joints are made, followed by heavy chain V to DJ joints and light chain V to J joints. In addition to the rearrangement of V, D and J segments, further diversity is generated in the primary repertoire of heavy and light chains by way of variable recombination at the locations where the V and J segments in the light chain are joined and where the D and J segments of the heavy chain are joined. Such variation in the light chain typically occurs within the last codon of the V gene segment and the first codon of the J segment. Similar imprecision in joining occurs on the heavy chain chromosome between the D and JH segments and may extend over as many as 10 nucleotides. Furthermore, several nucleotides which are not encoded by genomic DNA may be inserted between the D and JH and between the VH and D gene segments. The addition of these nucleotides is known as N region diversity. (Flynn, US2005/0255552).
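The sequential joining described above (D to J first, then V to DJ, with imprecise junctions and N nucleotide additions) can be illustrated with a small simulation. This is a minimal sketch, not a biological model: the segment sequences, the function names, and the `max_trim` and `max_n` limits are all invented for illustration, and trimming and N addition are drawn uniformly at random, which is a simplifying assumption.

```python
import random

# Toy germline segments; these sequences are invented for illustration only.
TOY_V = ["GTGCAGCTGGTGGAG", "CAGGTGCAGCTGCAG"]
TOY_D = ["GGTATAGCAGC", "GTATTACTATG"]
TOY_J = ["ACTACTTTGACTAC", "TGCTTTTGATATC"]


def n_region(rng, max_n):
    """Return 0..max_n random, non-templated nucleotides (N region)."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_n)))


def trim(segment, rng, max_trim, end):
    """Remove 0..max_trim bases from one end, always keeping at least one base."""
    cut = rng.randint(0, min(max_trim, len(segment) - 1))
    return segment[cut:] if end == "5prime" else segment[:len(segment) - cut]


def recombine_heavy_chain(rng, max_trim=3, max_n=4):
    """Pick one V, one D and one J segment and join them in two steps."""
    v, d, j = rng.choice(TOY_V), rng.choice(TOY_D), rng.choice(TOY_J)
    # Step 1: D-to-J joint, with imprecise ends and N nucleotides in between.
    dj = (trim(d, rng, max_trim, "3prime")
          + n_region(rng, max_n)
          + trim(j, rng, max_trim, "5prime"))
    # Step 2: V-to-DJ joint, again with trimming and N nucleotides.
    return (trim(v, rng, max_trim, "3prime")
            + n_region(rng, max_n)
            + trim(dj, rng, max_trim, "5prime"))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rearranged = [recombine_heavy_chain(rng) for _ in range(5)]
```

Even with only two toy segments per pool, repeated calls produce rearrangements of different lengths and junction sequences, which is the point the paragraph makes about junctional diversity adding to combinatorial diversity.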

The rearrangement of gene segments is mediated by recombination activating gene (RAG) 1 and 2 products. (Sturken, EP 2098536)

HCDR3 Region

Because the immunoglobulin H gene locus contains V, D and J gene segments, the coding region for a VH domain of an antibody requires two sequential V(D)J rearrangement events, whereas the immunoglobulin L gene locus lacks D gene segments and the VL coding region is generated by one V to J rearrangement. Thus, the junctional diversity that is generated by V(D)J recombination in the CDR3 region of IgH chains is greater than the CDR3 junctional diversity that is generated by only one rearrangement event for the IgL chains. In addition, at the early B cell differentiation stages, during which IgH chain gene rearrangements occur, the enzyme terminal deoxynucleotidyltransferase (TdT) is expressed, which is able to add non-templated nucleotides at the D to J and V to DJ junctions (so-called N-sequence diversity), additionally diversifying the IgH CDR3 repertoire. In contrast, the CDR3 repertoire of the IgL chain, which is formed upon V to J gene segment joining later during B cell differentiation, when TdT expression is largely downregulated, is somewhat less complex. (Sturken, EP 2098536).

Compared to the other CDRs, the varied length and biochemical properties of heavy-chain complementarity-determining region 3 (HCDR3) contribute to enhanced sequence diversity. It has been estimated that the theoretical HCDR3 diversity exceeds 10^15 variants, generated from fixed genomic sequences by combinatorial and junctional diversification mechanisms. The fully assembled V(D)J gene and its incorporated HCDR3 are derived from the sequential random assembly of 56 VH, 23 DH and 6 JH genes. While both VH and JH contribute to the HCDR3, the DH forms the central core. Although DH genes are predominantly read in one frame, all three frames can be used, further increasing potential diversity. In addition to the structural variability of HCDR3s with different sequences, the same HCDR3 can adopt different conformations within the same antibody bound to different targets, or in uncomplexed antibodies with different VH/VL frameworks. Yet despite its importance, HCDR3 is necessary, but insufficient, for specific antibody binding. While identical HCDR3 sequences can be generated by many different rearrangements, specific target binding is an outcome of unique rearrangements and VL pairing. (D’Angelo “Many Routes to an Antibody Heavy-Chain CDR3: Necessary, yet Insufficient, for Specific Binding” Frontiers in Immunology, March 2018).
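The gap between the segment counts and the estimated HCDR3 diversity can be made concrete with a short calculation. The sketch below uses only the figures quoted in the paragraph (56 VH, 23 DH, 6 JH and the 10^15 estimate); treating the three DH reading frames as a simple threefold multiplier is a simplifying assumption, and the variable names are illustrative.

```python
import math

# Segment counts quoted in the text.
N_VH, N_DH, N_JH = 56, 23, 6

# Combinatorial diversity: one V, one D and one J per rearrangement.
vdj_combinations = N_VH * N_DH * N_JH

# Assumption: each DH segment can be read in any of three frames,
# modelled here as a plain threefold multiplier.
with_reading_frames = vdj_combinations * 3

# The text's estimate of theoretical HCDR3 diversity.
ESTIMATED_HCDR3_DIVERSITY = 10**15

# Orders of magnitude that must come from junctional mechanisms
# (imprecise joining and N nucleotide addition) under these assumptions.
junctional_orders = math.log10(ESTIMATED_HCDR3_DIVERSITY / with_reading_frames)

print(vdj_combinations)      # 7728
print(with_reading_frames)   # 23184
```

Under these assumptions, segment choice alone accounts for only a few tens of thousands of combinations, so more than ten orders of magnitude of the estimated diversity have to come from junctional diversification.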

Human Germline V segment repertoire

In the human Ig H chain gene locus, there are an estimated 51 different functional VH gene segments, and each of these genes has been assigned to one of seven different VH gene families. Genes are assigned to the same family if they possess greater than 80% DNA sequence homology. The greatest conservation of sequence within a family resides in the framework subdomains. Further similarities between members of particular VH families allow for clustering of related families into clans. All known mammalian, amphibian and teleost VH genes can be assigned to one of only three VH clans. (Silverman, US2006/0205016).
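The family rule above (same family if DNA sequence identity is greater than 80%) can be sketched as a simple grouping procedure. This is an illustration under stated assumptions: the sequences and names are invented toy data, the sequences are assumed to be pre-aligned and of equal length, and single-linkage grouping is one possible reading of the rule rather than the procedure actually used to define the VH families.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def group_into_families(segments, threshold=80.0):
    """Single-linkage grouping: a segment joins every family in which at
    least one member exceeds the identity threshold, merging those families."""
    families = []
    for name, seq in segments.items():
        linked = [fam for fam in families
                  if any(percent_identity(seq, segments[m]) > threshold for m in fam)]
        merged = {name}
        for fam in linked:
            merged |= fam
            families.remove(fam)
        families.append(merged)
    return families


# Invented toy sequences (not real VH genes).
toy_segments = {
    "toyA1": "ACGTACGTACGTACGTACGT",
    "toyA2": "ACGTACGTACGTACGTACGA",  # 95% identical to toyA1
    "toyB1": "TTGGCCAATTGGCCAATTGG",  # 25% identical to toyA1
    "toyB2": "TTGGCCAATTGGCCAATTGC",  # 95% identical to toyB1
}

families = sorted(sorted(fam) for fam in group_into_families(toy_segments))
print(families)  # [['toyA1', 'toyA2'], ['toyB1', 'toyB2']]
```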

In humans, the heavy chain V gene is assembled from a relatively small number of basic building blocks: 51 VH segments, about 30 D segments and 6 JH segments. The small size of the germline repertoire emphasizes the importance both of junctional diversity and of the association with different light chain molecules in producing a diverse repertoire of antibodies. Human VH segments can be classified into seven families, VH1-VH7, with different members of the same family being at least 80% homologous at the nucleotide sequence level. (Cook, “The human immunoglobulin VH repertoire” Immunology Today, 5, 1995).

There are 51 germline VH genes in humans and each of these can be recombined. There are 40 V kappa genes and 31 V lambda genes. The VH germline genes are subdivided into 7 subclasses (VH1-VH7) and the germline light chains are subdivided into 16 subclasses (Vkappa1-Vkappa6 and Vlambda1-Vlambda10). In addition, there are stable allelic variants for most of these V segments, but the contribution of these variants to the structural diversity of the germline repertoire is limited. The sequences of all human germline V segment genes are known and can be accessed in the V BASE database, provided by the MRC Centre for Protein Engineering, Cambridge, United Kingdom. Germline human V gene sequences can be cloned from human genomic DNA by PCR or linear amplification methods in the same way that rearranged and somatically mutated V gene sequences are cloned from cDNA. For example, degenerate primers encoding all germline Framework 1 amino terminal sequences (not including signal peptide leaders) and all Framework 3 carboxyl terminal sequences can be used for ligation to CDR3. After cloning, selection for intact reading frames, sequence verification, and archiving, the repertoires can be used for assembly of combinatorial human V region libraries. (Flynn, US2005/0255552).

On the basis of nucleic acid sequence homology, the VH genes have been grouped into 6-7 families. Among the seven families, the VH3 family is the largest, and consists of 22 functional genes. The VH1 and VH4 families each contain about a dozen functional genes, and the VH2, VH5, VH6 and VH7 families contain 3, 2, 1 and 1 functional genes, respectively. (Kohsaka “The human immunoglobulin VH gene repertoire is genetically controlled and unaltered by chronic autoimmune stimulation” J. Clin. Invest., 1996).

V segment variants generated by somatic hypermutagenesis (see below) during the affinity maturation process may also make important contributions to the V segment repertoire, since these mutations appear to be non-random, and may confer structural adjustments which facilitate high affinity antigen specificity.

Class Switching

B cell development initiates in the bone marrow with a deletional recombination between a D and a J gene. Subsequently, a V gene recombines with the DJ to make a VDJ, which is transcribed, producing a spliced VDJCμ transcript. If the transcript is in frame, then a μ chain is synthesized upon translation. Similarly, and generally after VHDJH recombination and successful pairing of the μ chain with surrogate light chain, the Ig L chain loci rearrange their V and J gene segments. Successful B cell development in the bone marrow results in B cells expressing IgMκ or IgMλ on the cell surface. These IgM producing B cells form the primary immune repertoire and perform immune surveillance for recognition of foreign antigens. They can subsequently undergo isotype class switching from IgM to the IgG, IgA or IgE isotypes. Different isotypes have different effector functions. For example, the human IgG1 and IgG3 isotypes are involved in complement mediated lysis or ADCC, while IgG2 and IgG4 have little or no known effector function. Class switch recombination from IgM to IgG, IgA or IgE is mediated through a deletional recombination event occurring between tandem directly repetitive switch regions present 5′ of the IgH constant region genes. Switch regions are known to be composed of the I promoter, the I exon and a set of direct repeats flanked by inverted repeat sequences. Enhancers and cytokine response sequences are known to lie in the region near the I promoter. Different substances are known to affect class switching. For example, the combination of LPS and IL-4 in vitro induces class switching to IgG1 and IgE and suppresses switching to IgG2b and IgG3. (Green, US 2003/0093820).

Green (US 2003/0093820) discloses fully human antibodies in a transgenic animal which include a human constant region gene segment that includes exons encoding a desired H chain isotype operably linked to switch segments from a constant region of a different H chain isotype. The transgenes include a DNA sequence identical to the DNA sequence of human chromosome 14 starting at least from the first D segment of the H chain locus through the J segment genes and the constant region genes through Cμ of that locus. The transgenes are operably linked to a DNA sequence capable of isotype switching to an additional constant region segment. The constant region coding segment is operably linked to a switch region that is not normally associated with it. In one embodiment, a transgene of the invention is introduced into an embryonic stem (ES) cell, which is then inserted into a blastocyst, which is then surgically inserted into the uterus of a non-human animal to produce a chimeric non-human animal.

Somatic Mutation

In the bone marrow, the body makes millions of different versions of B cells by rearranging antibody genes. These cells are referred to as “naive B cells” because they have not yet encountered antigen. Once the B cells come into contact with antigen in lymphoid organs such as the lymph nodes, they start to multiply and begin undergoing a process called “somatic hypermutation” which causes additional mutations in the antibody genes. In other words, when an antibody response to a T cell dependent antigen is mounted, those B cells with antibodies capable of engaging the antigen proliferate to form large clones. In addition, members of the B cell clone diversify their variable genes by a hypermutation mechanism. The time period during which mutational diversification occurs is not known, although it is known that somatic mutations are acquired at some stage of the primary immune response, and possibly during secondary and later responses. The B cells with mutations that result in the most improved binding to antigen multiply again and start another round of somatic hypermutation. These repeated cycles of somatic hypermutation and selection result in increasingly affinity matured antibodies. It is known that recruitment into the memory B cell compartment of the immune repertoire is strongly correlated with acquisition of specific somatic mutations and combinations which confer upon the antibody product increased affinity for the immunizing antigen.

The somatic mutation process introduces point nucleotide substitutions in the assembled antibody genes expressed by clones of immune participating B lymphocytes, often in the encoded antibody variable region. In this way, the antibodies expressed by different members of a mutationally active B cell clone may differ in variable region sequence and potentially in binding site structure and function.
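Whether a point nucleotide substitution changes the encoded antibody depends on the codon it falls in. The sketch below classifies a single substitution as silent or as an amino acid replacement using the standard genetic code. The function names and the six-nucleotide example are illustrative assumptions; the input is assumed to be an in-frame coding sequence whose length is a multiple of three.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_substitution(dna, position, new_base):
    """Apply one point substitution and report whether it is silent or a replacement."""
    mutated = dna[:position] + new_base + dna[position + 1:]
    start = position - position % 3  # first base of the affected codon
    before = CODON_TABLE[dna[start:start + 3]]
    after = CODON_TABLE[mutated[start:start + 3]]
    kind = "silent" if before == after else "replacement"
    return mutated, kind


# Invented example: GCT AGC encodes Ala-Ser.
example = "GCTAGC"
print(translate(example))                      # AS
print(classify_substitution(example, 2, "C"))  # ('GCCAGC', 'silent')
print(classify_substitution(example, 4, "A"))  # ('GCTAAC', 'replacement')
```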

Importance of Activation-induced deaminase (AID)

The somatic mutations are specifically targeted to the VH and VL coding regions, and are mediated by the B lineage specific enzyme activation-induced cytidine deaminase (AID). As a consequence of somatic hypermutation occurring during immunization, cells expressing higher affinity antibody mutants against the immunogen are positively selected in the course of an immunization, mostly in germinal centers, resulting in an enrichment of cells producing higher affinity antibodies. The AID mediated diversification and specific targeting to the VH and VL coding regions is significantly increased by the presence of cis-regulatory genetic elements, in particular enhancer elements of the IgH and IgL chain gene loci, located in the proximity of the rearranged VH and VL coding regions. (Sturken, EP 2098536).

AID not only has an important function in causing somatic hypermutation but is also important for class switch recombination, which changes the class of the constant region of an antibody, as well as for gene conversion. PI3 kinase is known as a factor functioning upstream of AID. PI3 kinase phosphorylates position 3 of the inositol ring of phosphatidylinositol, and plays an important role in various cellular functions such as cell survival, cell growth, cell motility and the transport of intracellular organelles. As a result of studies using a PI3K inhibitor in mouse B cells, it has been demonstrated that when p110epsilon signaling is suppressed, the expression of AID is increased and class switch recombination is promoted. (Niikura, US Patent Application No: 16/484,061, published as US 2019/0359691).

Niikura (US Patent Application No: 16/484,061, published as US 2019/0359691) discloses a method for promoting diversification of the amino acid sequences of variable regions of an antibody generated by an avian B cell population, which includes suppressing the PI3Kalpha activity of each avian B cell in a population expressing the antibody.

Diversification is regulated by the B cell specific enzyme Activation-Induced Cytidine Deaminase (AID):

AID is a small protein of 200 amino acids and 24 kDa. AID has the ability to mutate highly transcribed genes independent of sequence and position. To avoid genome-wide DNA damage it has to be tightly controlled. AID gene expression in mouse and human is induced by factors that mediate germinal center B cell activation, like IL-4 or CD40 ligand (CD40L). IL-4 and CD40L act synergistically, presumably through activation of signal transducer and activator of transcription 6 (STAT6) and NF-κB. IL-4 induces STAT6 binding to a site upstream of the promoter of the AID gene, and CD40L induces binding of NF-κB to two promoter sites located in the same region. AID remains restricted to the cytoplasm of lymphocytes, where it has no possibility of influencing target genes until activation. According to a current model, AID initiates diversification processes by the deamination of deoxycytidine to uracil. Uracil is further processed by uracil DNA glycosylase, thereby creating an abasic site. (Ulrike Schotz. “Diversification of the immunoglobulin genes: analysis of the molecular mechanisms in the chicken B cell line DT40” Dissertation, 2009).

Unconventional Antibody Diversification

In addition to the conventional genetic mechanisms used to generate antibody diversity, the immune system uses several alternative strategies to broaden the antibody repertoire. These include insertion of non-immunoglobulin sequences in the variable region, post-translational modification of the variable region, conformational heterogeneity of the antigen-binding site and use of nonprotein cofactors for antigen recognition (metal ions or haem). There are many types of post-translational modification, the most prominent being glycosylation, phosphorylation, lipidation and sulfation. In addition to C region modification, a fraction of human antibodies can undergo post-translational changes that are localized in the V region and thus might directly influence antigen binding. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

Tyrosine sulfation: O-sulfation of tyrosine is a post-translational modification present in more than 200 human proteins. It affects mostly secretory and membrane-bound proteins, with examples including coagulation factors and G protein-coupled receptors. The sulfation of tyrosine residues is enzymatically catalysed by membrane-associated tyrosylprotein sulfotransferases that are localized in the Golgi apparatus. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

A group of HIV-1 specific antibodies recognize an epitope on the viral envelope protein complex that is displayed only upon interaction of viral gp120 with CD4 on host cells. Accordingly, these antibodies are referred to as CD4-induced (CD4i) antibodies. Characterization of CD4i monoclonal antibodies isolated from people infected with HIV-1 revealed a peculiar feature of some of these antibodies: they have a sulfotyrosine in their CDR H3. The presence of tyrosine sulfation was demonstrated to be critical for binding to gp120 and for virus neutralization by some CD4i monoclonal antibodies. Mutation of tyrosine to phenylalanine, or silencing of tyrosylprotein sulfotransferases in antibody expressing cells, resulted in complete abrogation of gp120 binding and HIV-1 neutralization by the CD4i antibody.

N-glycosylation of the V region: In healthy humans, 15-25% of IgG antibodies have N-linked glycan structures in their VH or VL regions. The frequency of antibodies with V region bound glycan structures can increase considerably in some pathological conditions, such as rheumatoid arthritis and Sjögren syndrome. Interestingly, human antibody repertoires encoded by naive B cells are almost devoid of antigen-binding fragment (Fab) bound glycans. The glycosylation sites are virtually absent from the germline sequences and are introduced in V region genes predominantly as a consequence of the somatic hypermutation process. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).
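The observation that glycosylation sites are absent from germline sequences and appear through somatic hypermutation suggests a simple comparison between a germline V region and its mutated counterpart. The sketch below assumes the canonical N-linked glycosylation sequon, Asn-X-Ser/Thr with X being any residue except proline; that motif is general biochemistry and is not stated in the text above. The sequences and function names are invented for illustration, and the presence of a sequon only indicates a potential site, not that a glycan is actually attached.

```python
import re

# Canonical N-glycosylation sequon: N, then any residue except P, then S or T.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return the 0-based positions of the Asn in each potential sequon."""
    return [m.start() for m in SEQUON.finditer(protein)]


def acquired_sequons(germline, mutated):
    """Sequon positions present in the mutated sequence but not in the germline.

    Assumes both sequences have the same length and numbering
    (substitutions only, no insertions or deletions).
    """
    return sorted(set(find_sequons(mutated)) - set(find_sequons(germline)))


# Invented toy peptides: a D-to-N substitution at position 2 creates N-S-T.
toy_germline = "AKDSTG"
toy_mutated = "AKNSTG"

print(find_sequons(toy_germline))                   # []
print(acquired_sequons(toy_germline, toy_mutated))  # [2]
```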

The most frequent glycan structure attached to V regions is a complex biantennary type enriched in terminal 2,6-linked sialic acids. In contrast to the N-glycan attached to Asn297 of the C region, V region glycans are considerably more exposed on the surface of the IgG molecule. The predominant localization of N-glycosylation sites in the antigen-binding site and the high exposure of the glycan suggest that this post-translational modification can influence the antigen-binding specificity of the antibodies. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

A panel of Fab-glycosylated human IgG monoclonal antibodies were mutated at the glycosylation sites back to the germline residues. In most IgGs, the mutations abrogating the glycosylation of the V region resulted in a significant decrease in the binding affinity for the target antigen. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

Addition of N-glycans has been used for the rational engineering of antibodies. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).

Use of metal ions: The antibody Q425 is a mouse monoclonal IgG that specifically recognizes an epitope in domain 3 of the CD4 receptor and blocks the fusion of the HIV-1 envelope with the membrane of CD4-positive cells. The binding of this antibody to its epitope was demonstrated to depend greatly on the presence of Ca2+ ions. (Kanyavuz, Nature Reviews, “Breaking the law: unconventional strategies for antibody diversification” Immunology, 10, June 2019).
