Mechanisms

Proteases (also termed “proteolytic enzymes”, “peptidases” or “proteinases”) attack their substrates using two approaches. Exopeptidases remove single residues or dipeptides sequentially from either the N-terminus or the C-terminus, depending on the specificity of the particular protease, whereas endopeptidases make one or more cleavages within the protein to break it down into smaller polypeptide fragments. A natural consequence of protein cleavage is the generation of a new N-terminus if the cleavage is carried out by an aminopeptidase, a new C-terminus if the hydrolysing enzyme is a carboxypeptidase, or both a new N-terminus and a new C-terminus if an endopeptidase is involved. Because these new termini are antigenically distinct from the intact protein, it is possible to generate antibodies that react with such new epitopes. The general term “anti-neoepitope antibody” has been used to describe such antibodies.

Applications and Functions

Proteolytic enzymes regulate protein processing and intracellular protein levels by removing abnormal and damaged proteins from the cell, and they also play a defensive role against herbivores, insects and pests. In addition to their biological roles, proteases have been exploited commercially in the food, leather, textile and detergent industries because of their broad substrate specificities and their activity over a wide range of pH, temperature and other denaturing conditions. In the detergent industry, for example, proteases are used to improve the cleaning efficiency of laundry detergents (Kumari, J. Agric. Food Chem. 2010, 58, 8027-8034).

Although necessary for proper function during inflammation and remodeling, proteases can cause serious tissue damage if improperly regulated. For example, extracellular matrix degradation is an underlying feature of many degenerative diseases, such as arthritis. Although the hydrolysis of peptide bonds is an integral part of most physiological and pathological processes, knowledge is often lacking as to which peptide bonds are cleaved, in which protein substrates, in which order, and by which proteolytic enzymes (Mort, J Clin Pathol, 1999, 52 11-18).

Common Proteases

Clostripain (AKA “endoproteinase Arg-C”): is a two-chain proteinase that can be isolated from Clostridium histolyticum. It has been shown to have both proteolytic and amidase/esterase activity and has an optimal pH range of 7.6-7.9. Clostripain preferentially cleaves at the carboxyl group of arginine residues; however, the cleavage of lysyl bonds has also been reported. While clostripain accepts substrates containing Lys instead of Arg, reaction rates are low in comparison to reactions with Arg-containing substrates. Clostripain cleavage is often used in biomedical and biotechnological applications, including peptide mapping, sequence analysis, cell isolation, hydrolysis/condensation of amide bonds and peptide synthesis. Clostripain may also be used to cleave off tags.
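The Arg-C specificity described above can be illustrated with a minimal in-silico digest. This is only a sketch: the function name and the example sequence are hypothetical, and the code assumes idealized cleavage after every arginine, ignoring the slower cleavage at lysine and any sequence-context effects.

```python
def arg_c_digest(protein: str) -> list[str]:
    """Split a one-letter protein sequence on the carboxyl side of every Arg (R).

    Idealized assumption: every Arg is cleaved and no Lys is cleaved.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        if residue == "R":
            # The fragment ends with the arginine whose carboxyl bond is cut.
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing fragment that does not end in Arg.
        fragments.append(protein[start:])
    return fragments


# Hypothetical sequence used only for illustration.
peptides = arg_c_digest("MKTAYRAKQRQISFVK")
print(peptides)
```

Each internal cut creates one new C-terminus and one new N-terminus, which is the neoepitope idea from the Mechanisms section.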

Papain is a plant-derived sulfhydryl protease which is commonly used to digest IgG antibodies into either Fab or F(ab′)2 fragments, depending on whether L-cysteine is present or absent during the reaction, respectively.

Introduction to CRISPR/Cas9

Companies: Intellia, Aldevron (Danaher)

Non-profits: Innovative Genomics Institute, Innovative Genomics Institute Education Section, Biopku

CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR associated) systems are prokaryotic adaptive immune systems that bind and cleave foreign nucleic acids. The most frequently used type II CRISPR system is composed of two components: the Cas9 nuclease and an artificial single guide RNA (sgRNA), a fusion of a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). When the SpCas9-sgRNA complex recognizes an NGG (N = A, T, C or G) protospacer-adjacent motif (PAM) sequence, the spacer of the sgRNA pairs with the target DNA strand to form an “R-loop” structure. Subsequently, the Cas9 nuclease cleaves both DNA strands and produces a blunt-end DSB 3 bp upstream of the PAM, within the protospacer. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)
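The PAM rule above can be sketched as a simple sequence scan. The function name, the default 20-nt spacer length and the example sequence are assumptions for illustration; the sketch only scans the strand it is given and does not score off-targets.

```python
import re


def find_spcas9_sites(dna: str, spacer_len: int = 20):
    """Return (protospacer, pam, cut_index) for each NGG PAM on the given strand.

    cut_index is the 0-based position of the first base 3' of the blunt cut,
    i.e. the cut falls 3 bp upstream of the PAM.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence 5' of the PAM for a full protospacer
        protospacer = dna[pam_start - spacer_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


# Hypothetical 26-nt sequence: a 20-nt protospacer, a TGG PAM, then AAA.
example = "ACGTACGTACGTACGTACGTTGGAAA"
print(find_spcas9_sites(example))
```

A real guide-design tool would also scan the reverse complement and rank candidates by predicted specificity.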

The Cas9 protein (CRISPR-associated protein 9), derived from type II CRISPR (clustered regularly interspaced short palindromic repeats) bacterial immune systems, is an RNA-guided DNA endonuclease. Cas9 can be easily programmed to target new sites by altering its guide RNA sequence, and its development as a tool has made sequence-specific gene editing much easier. The CRISPR system is an adaptive immune mechanism present in many bacteria and the majority of characterized Archaea. CRISPR-containing organisms acquire DNA fragments from invading bacteriophages and plasmids before transcribing them into CRISPR RNAs to guide cleavage of invading RNA or DNA. The Cas9 system has the potential to cure diseases. For example, when Cas9 is introduced into infected cells together with sgRNAs targeting crucial viral genomic elements, it helps to inactivate or clear the viral genome and thereby defends the cells from infections such as HIV, hepatitis B virus, HPV and Epstein-Barr virus. (Wang, “CRISPR/Cas9 in genome editing and beyond” Annu. Rev. Biochem, 2016, 85: 22.1-22.38).

CRISPR/Cas9 can be used for genome modification in eukaryotic cells. Guide RNAs are synthesized specific to the target location to be modified, and they direct the Cas9 nuclease to make double-stranded breaks (DSBs). A nonhomologous end joining (NHEJ) repair mechanism ligates the cut ends of the DNA nonspecifically, whereas a homology-directed repair (HDR) mechanism repairs the double-stranded breaks using a donor template with sequences homologous to the cut site. NHEJ-based DNA repair is error prone and results in small nucleotide insertions and deletions that lead to frameshifts or deletion mutations and potentially to the loss of gene function. This repair pathway is used for engineering gene knockout or loss-of-function models. By contrast, HDR-mediated repair is more accurate because it integrates a donor DNA with homology arms matching the sequence at the DSB. It is thus used for site-specific gene knock-in and point mutation models.
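Whether an NHEJ indel shifts the reading frame depends only on whether its net length is a multiple of three. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and it assumes the indel lies within a coding sequence.

```python
def classify_indel(ref_len: int, edited_len: int) -> str:
    """Classify a repair outcome by the net change in coding-sequence length."""
    delta = edited_len - ref_len
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # Codons are 3 nt long, so only multiples of 3 preserve the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


for edited in (299, 297, 301, 306):
    print(edited - 300, classify_indel(300, edited))
```

Frameshift outcomes are the ones most likely to produce the loss-of-function alleles used in knockout models.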

The canonical Cas9 variant sourced from Streptococcus pyogenes (SpCas9) recognizes and binds to an NGG protospacer adjacent motif (PAM) and, after binding, cuts 3 nt upstream of the PAM, resulting in a blunt-ended double strand break. Other naturally occurring Cas proteins (for example, SaCas9 and NmCas9) offer smaller sizes and altered PAM specificities, which can be useful for viral packaging and for expanding targeting ranges. Further, some Cas proteins can site-specifically target RNAs, notably Cas13 variants. In addition to repurposing naturally occurring CRISPR-Cas platforms, the Cas9 protein has been engineered for improved specificity, for expanded targeting ranges and to allow sequence modifications without DSBs. (Bashor, “Engineering the next generation of cell-based therapeutics” Nature Reviews Drug Discovery)

Zhang (US 8,697,359) discloses a Type II CRISPR-Cas system having three components: (1) a crRNA molecule called the “guide sequence”, (2) a “tracr RNA” called an “activator-RNA” and (3) a protein called Cas9. To alter a DNA molecule, the system must achieve three interactions: (1) crRNA binding by specific base pairing to a specific sequence in the DNA of interest (“target DNA”), (2) crRNA binding by specific base pairing at another sequence to a tracr RNA and (3) tracr RNA interacting with a Cas9 protein, which then cuts the target DNA at a specific site. The CRISPR-Cas9 system occurs naturally in bacteria (prokaryotes).

Parts to CRISPR/Cas9:

Guide (“Targeting” or “spacer”) sequence: refers to a sequence within a guide RNA that is complementary to a target sequence and functions to direct a guide RNA to a target sequence for binding or modification (e.g., cleavage) by an RNA-guided DNA binding agent. A guide sequence can be 20 base pairs in length, such as in the case of Streptococcus pyogenes Cas9 (i.e., Spy Cas9) and related Cas9 homologs/orthologs. The guide RNA can be designed to recognize (e.g., hybridize to) a target sequence of a particular gene.

One critical component of CRISPR-Cas9 and other Cas-based technology is gRNA design. The DNA, unwound by Cas9, is probed by the gRNA sequence adjacent to the PAM. The first few nucleotides are referred to as the “seed” sequence, where Cas9 is most sensitive to mismatches. If there is full seed region complementarity, Cas9 is more likely to remain bound to the DNA and mediate cleavage, which can lead to off-target effects. Partial complementarity between the Cas9–gRNA complex and the seed region of the DNA sequence increases dissociation of Cas9, such that single nucleotide changes, although having little impact on DNA association, increase the Cas9–gRNA–DNA dissociation rates from <0.006/s to >2/s. The elucidation of these kinetics has allowed for more precise gRNA designs that minimize off-target cleavage activity, which is critical for the development of tailored and effective CRISPR-Cas9 gene therapies. (Balderston, CRISPR Journal, 4(3), 2021)
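To give a feel for the rate constants quoted above, they can be converted into residence half-lives. This is a rough sketch that assumes simple first-order dissociation (half-life = ln 2 / k_off), which is a simplification not stated in the source; the function name is hypothetical.

```python
import math


def residence_half_life(k_off_per_s: float) -> float:
    """Half-life in seconds of a complex that dissociates with first-order rate k_off."""
    if k_off_per_s <= 0:
        raise ValueError("k_off must be positive")
    return math.log(2) / k_off_per_s


# Boundary values quoted in the text: <0.006/s (matched seed) and >2/s (seed mismatch).
for label, k_off in (("matched seed", 0.006), ("seed mismatch", 2.0)):
    print(f"{label}: k_off = {k_off}/s, half-life about {residence_half_life(k_off):.2f} s")
```

Under this assumption a matched complex stays bound for minutes, while a seed-mismatched complex falls off in well under a second.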

A guide RNA for a CRISPR/Cas9 nuclease system includes a CRISPR RNA (crRNA) and a tracr RNA (tracr). For example, the crRNA can include a targeting sequence that is complementary to and hybridizes with the target sequence on the target nucleic acid molecule. The crRNA may also include a flagpole that is complementary to and hybridizes with a portion of the tracr RNA to promote the formation of a functional CRISPR/Cas complex. In some embodiments, the flagpole can include all or a portion of the sequence (also called a “tag” or “handle”) of a naturally occurring crRNA that is complementary to the tracr RNA in the same CRISPR/Cas system. In some embodiments, the guide RNA may include two RNA molecules and is referred to as a “dual guide RNA” (dgRNA). The dgRNA may include a first RNA molecule that includes a crRNA and a second RNA molecule that includes a tracr RNA. The first and second RNA molecules may form an RNA duplex via the base pairing between the flagpole on the crRNA and the tracr RNA. In some embodiments, the guide RNA may include a single RNA molecule and is referred to as a “single guide RNA” (sgRNA). The sgRNA can include a crRNA covalently linked to a tracr RNA. In some embodiments, the crRNA and the tracr RNA may be covalently linked via a linker. Shah (WO 2017/173054)

In some embodiments, nucleic acids, e.g., expression cassettes, encoding the guide RNA are included. In some embodiments, the nucleic acid may be a DNA molecule. The nucleotide sequence encoding the guide RNA may be operably linked to at least one transcriptional or regulatory control sequence, such as a promoter, a 3′ UTR or a 5′ UTR. Shah (WO 2017/173054)

Target sequence: target sequences for Cas proteins include both the positive and negative strands of genomic DNA (i.e., the sequence given and the sequence’s reverse complement), as the nucleic acid substrate for a Cas protein is a double-stranded nucleic acid. The target sequence has complementarity to the guide sequence of the gRNA. The interaction of the target sequence and the guide sequence directs an RNA-guided DNA binding agent to bind, and potentially nick or cleave (depending on the activity of the agent), within the target sequence.
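Because either strand can carry the target, a sequence search usually needs the reverse complement as well as the sequence given. A minimal sketch, with a hypothetical helper name and plain A/C/G/T input assumed:

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


given = "ATGCCA"
print(given, "->", reverse_complement(given))
```

One practical consequence is that an NGG PAM on the opposite strand shows up as CCN when reading the given strand.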

RNA-guided DNA binding agent (Cas Nuclease): means a polypeptide or complex of polypeptides having RNA and DNA binding activity, or a DNA binding subunit of such a complex, wherein the DNA binding activity is sequence specific and depends on the sequence of the RNA. Examples of RNA guided DNA binding agents include Cas cleavases/nickases and inactivated forms thereof (“dCas DNA binding agents”). “Cas nuclease”, also called “Cas protein”, encompasses Cas cleavases, Cas nickases and dCas DNA binding agents. In some embodiments, the RNA guided DNA binding agent is a Class 2 Cas nuclease. The RNA guided DNA binding agent can have cleavase activity, which can also be referred to as double stranded endonuclease activity. The RNA guided DNA binding agent includes a Cas nuclease, such as a Class 2 Cas nuclease (which may be, e.g., a Cas nuclease of Type II, V or VI). Class 2 Cas nucleases include, for example, Cas9, Cpf1, C2c1, C2c2 and C2c3 proteins and modifications thereof. Examples of Cas9 nucleases include those of the type II CRISPR systems of S. pyogenes. In some embodiments, the RNA guided DNA binding agent has single strand nickase activity, i.e., can cut one DNA strand to produce a single strand break, also known as a “nick”. In some embodiments, an mRNA encoding a nickase is provided in combination with a pair of guide RNAs that are complementary to the sense and antisense strands of the target sequence. In this embodiment, the guide RNAs direct the nickase to a target sequence and introduce a DSB by generating a nick on opposite strands of the target sequence (i.e., double nicking). In some embodiments, a nickase is used together with two separate guide RNAs targeting opposite strands of DNA to produce a double nick in the target DNA. An RNA guided DNA binding agent may include a nuclear localization signal (NLS). (Kanjolia US 2020/0248180).

In S. pyogenes, Cas9 generates a blunt-ended double-stranded break 3 bp upstream of the protospacer-adjacent motif (PAM) via a process mediated by two catalytic domains in the protein: an HNH domain that cleaves the complementary strand of the DNA and a RuvC-like domain that cleaves the non-complementary strand. (Mali, Nat Biotechnol. 2013, 31(9) 833-838). Cas9 nucleases, which are components of type II CRISPR-Cas systems, are RNA-guided DNA endonucleases that induce DSBs at target sites. Cas9 has two distinct nuclease domains, HNH and RuvC, which cleave the target and non-target strand respectively. Inactivation of either nuclease domain creates a Cas9 nickase (nCas9), which cleaves only one DNA strand. Inactivation of both nuclease domains generates dead Cas9 (dCas9), which still binds to target DNA. nCas9 is useful in base editors and prime editors, which perform precise genome editing without requiring DSBs, and dCas9 serves as a scaffold for recruiting effectors proximal to specific genomic sites. dCas9 is widely used for regulating transcription, altering epigenetic controls, imaging living cells and other purposes. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)
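The relationship between the two nuclease domains and the Cas9, nCas9 and dCas9 forms can be summarized as a small truth table. The function and its return labels are hypothetical and only restate the paragraph above.

```python
def cas9_variant(hnh_active: bool, ruvc_active: bool) -> str:
    """Map the activity of the HNH and RuvC domains to the resulting Cas9 form."""
    if hnh_active and ruvc_active:
        return "Cas9: double-strand break"
    if hnh_active:
        # Only HNH works, so only the target (complementary) strand is cut.
        return "nCas9: nick on target strand"
    if ruvc_active:
        # Only RuvC works, so only the non-target strand is cut.
        return "nCas9: nick on non-target strand"
    return "dCas9: binds DNA without cleavage"


for hnh in (True, False):
    for ruvc in (True, False):
        print(f"HNH={hnh!s:5} RuvC={ruvc!s:5} -> {cas9_variant(hnh, ruvc)}")
```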

Template: CRISPR-Cas systems can also include a template nucleic acid used to alter or insert a nucleic acid sequence at or near a target site for a Cas nuclease. In some cases, the template may be used in homologous recombination, which can result in the integration of the template sequence or a portion of the template sequence into the target nucleic acid molecule. In some cases, a single template may be provided, but in other cases two or more templates can be provided such that homologous recombination can occur at two or more target sites. In other cases, the template may be used in homology-directed repair, which involves DNA strand invasion at the site of the cleavage in the nucleic acid. In some cases, the homology-directed repair can result in including the template sequence in the edited target nucleic acid molecule. In some cases, the template sequence may include an exogenous sequence, such as a protein or RNA coding sequence operably linked to an exogenous promoter sequence, such that, upon integration of the exogenous sequence into the target nucleic acid molecule, the cell is capable of expressing the protein or RNA encoded by the integrated sequence. In some examples, the exogenous sequence may provide a cDNA sequence encoding a protein or a portion of the protein. The target nucleic acid molecule may be DNA or RNA that is endogenous or exogenous to a cell. In some cases, the target nucleic acid is an episomal DNA, a plasmid, a genomic DNA, a viral genome, mitochondrial DNA or a chromosome from a cell or in the cell.

In cases involving a Cas nuclease, such as a Class 2 Cas nuclease, the target sequence may be adjacent to a protospacer adjacent motif (“PAM”). In some cases, the PAM may be adjacent to or within 1-4 nucleotides of the end of the target sequence. The PAM may be selected from a consensus or a particular PAM sequence for a specific Cas9 protein.

Modified CRISPR:

A group of researchers has used a modified version of Cas9 bearing a D10A mutation, which produces “nicks” in target DNA, to build a system they term EvolvR for dramatically increasing in vivo mutagenesis (Nature, Letter “CRISPR-guided DNA polymerases enable diversification of all nucleotides in a tunable window”). This was shown in E. coli by producing cells resistant to the antibiotic streptomycin. The modified nCas9 is fused at the amino terminus of a fidelity-reduced variant (D424A, I709N, A759R) of E. coli DNA polymerase I. The plasmid encoding this construct also encoded a guide RNA (gRNA) to target a second plasmid containing a homologous gene sequence. These experiments showed mutations arising within a 17-nucleotide window 3′ of the nick site, consistent with the known 15-20 nucleotide processivity of the polymerase. Mutation rates using this system were further enhanced by making additional mutations in nCas9 (K848A, K1003A, R1060A); these mutations had been suggested to lower the non-specific DNA affinity of Cas9.

The Cas nuclease mRNA may be modified for improved stability and/or immunogenicity properties. The modifications may be made to one or more nucleosides within the mRNA. Examples of chemical modifications to mRNA nucleobases include pseudouridine, 1-methyl-pseudouridine and 5-methyl-cytidine. The mRNA encoding a Cas nuclease may also be codon optimized for expression in a particular cell type, such as a eukaryotic cell. The mRNA can also include a 5′ cap, such as m7G(5′)ppp(5′)N. The mRNA can also include at least one element that is capable of modifying the intracellular half-life of the RNA. In some cases, the element may be within the 3′ UTR of the RNA. For example, the element may include an mRNA decay signal. The element may also include a polyadenylation signal. Shah (WO 2017/173054)

Prime editors: are an alternative CRISPR-Cas based technology that allows for the insertion of small (<50 bp) sequences at a target site. Prime editors rely on a CRISPR-Cas9 nickase, a reverse transcriptase and a modified gRNA. A Cas9 nickase is a modified Cas protein where one of the catalytic domains is inactivated via a mutation so that only a single DNA strand is cut instead of both strands. The modified gRNA, termed prime-editing guide RNA (pegRNA), serves two functions. First, it specifies the site for genome insertion, and second, it provides a template for insertion into the genome. The first IND cleared using prime editing technology is an ex vivo hematopoietic stem cell therapy where the prime editor is used to correct a mutation. (Christina Fuentes “Coming of age: an overview of the growing toolbox for gene editing and its use in CGT applications” Cell & Gene Therapy Insights, 2024; 10(9), 1221-1236).

Thermostable Genome Editors:

Geobacillus stearothermophilus Cas9 (GeoCas9): is stable in human serum and is an efficient genome editor. In addition, by adjusting the type of lipid used in the LNP, one can direct an RNP toward different organs, such as the lungs in mice. GeoCas9 was mutated using directed evolution to optimize its ability to edit. The result was iGeoCas9, with more than 100 times the efficiency of the wild type (WT). While SpCas9 recognizes a simple NGG PAM, WT GeoCas9 requires a PAM sequence that is nearly 3 times as long (N4CRAA). Thus, compared to SpCas9, the range of editing options for GeoCas9 is much smaller. iGeoCas9, on the other hand, can recognize a less stringent N4SNNA PAM. iGeoCas9 is more negatively charged than its peers. Thus, unlike SpCas9, it is less vulnerable to degradation by organic solvents, and it can potentially be delivered using LNPs whereas SpCas9 cannot. The stability of iGeoCas9-based LNP-RNP complexes at 4 °C could significantly reduce the need for cold chain logistics (many at -70 °C), which are often prohibitively expensive and difficult to maintain in regions with limited infrastructure. (Hough, “Hope springs forth with thermostable genome editors” MRNA & Gene Editing, Drug Delivery, Oct 28, 2024).
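The difference in PAM stringency can be made concrete by expanding the IUPAC codes used above (N = any base, R = A or G, S = C or G) and counting how many 8-mers each PAM accepts. This is a sketch with hypothetical helper names; it covers only the ambiguity codes that appear in this section.

```python
import re

# Subset of IUPAC nucleotide codes needed for the PAMs discussed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "R": "AG", "S": "CG"}


def matches_pam(pam: str, seq: str) -> bool:
    """True if seq satisfies the IUPAC-coded PAM pattern position by position."""
    pattern = "".join(f"[{IUPAC[code]}]" for code in pam)
    return re.fullmatch(pattern, seq.upper()) is not None


def pam_degeneracy(pam: str) -> int:
    """Number of distinct sequences that satisfy the PAM."""
    count = 1
    for code in pam:
        count *= len(IUPAC[code])
    return count


wt_pam, evolved_pam = "NNNNCRAA", "NNNNSNNA"  # N4CRAA and N4SNNA written out
print(pam_degeneracy(wt_pam), pam_degeneracy(evolved_pam))
print(matches_pam(wt_pam, "ACGTGTCA"), matches_pam(evolved_pam, "ACGTGTCA"))
```

By this count the relaxed PAM accepts 16 times as many 8-mers as the wild-type PAM, which is one way to quantify the larger editing range.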

 

Delivery of CRISPR/Cas cargoes

Lipid Nanoparticles: 

Shah (WO 2017/173054) discloses lipid nanoparticle (LNP) based compositions for delivery of CRISPR/Cas editing components. The LNPs include a plurality of lipid molecules physically associated with each other by intermolecular forces. The LNP compositions are preferentially taken up by liver cells (e.g., hepatocytes). The LNPs specifically bind to apolipoproteins such as apolipoprotein E (ApoE) in the blood. Apolipoproteins are proteins circulating in plasma that are key in regulating lipid transport. ApoE represents one class of apolipoproteins, which interacts with cell surface heparan sulfate proteoglycans in the liver during the uptake of lipoprotein. The lipid compositions for delivery of CRISPR/Cas mRNA and guide RNA components to a liver cell include a CCD lipid, which can in some cases be Lipid A, Lipid B or Lipid D. The lipids may be ionizable depending upon the pH of the medium they are in. For example, in a slightly acidic medium, the lipids may be protonated and thus bear a positive charge. Conversely, in a slightly basic medium such as blood, where the pH is about 7.35, the lipids may bear no charge. In some embodiments, the lipids may be protonated at a pH of at least about 9. The ability of a lipid to bear a charge is related to its intrinsic pKa. For example, the lipids may have a pKa of from about 5.8 to about 6.2, which may be advantageous in that cationic lipids with a pKa of about 5.1-7.4 are effective for delivery of cargo to the liver. Additional lipids can be included in the composition, such as neutral lipids (uncharged or zwitterionic lipids). Helper lipids, which enhance transfection of the nanoparticle, can also be included. Stealth lipids, which alter the length of time the nanoparticle can exist in vivo, can also be included. In some examples, the LNPs are formed by mixing an aqueous RNA solution with an organic solvent-based lipid solution, such as 100% ethanol. A buffer is used to maintain the pH of the composition comprising LNPs, for example at or above pH 7.0.
For example, the LNPs can be formulated with a CCD lipid amine (e.g., Lipid A or Lipid B) to RNA phosphate (N:P) molar ratio of about 4.5. The lipid nanoparticle components are dissolved in 100% ethanol with CCD lipid, helper lipid (e.g., cholesterol), neutral lipid (e.g., DSPC) and PEG. The RNA cargo is dissolved in acetate buffer, pH 4.5, resulting in a concentration of RNA cargo of about 0.45 mg/ml. The LNPs are formed by microfluidic mixing of the lipid and RNA solutions using a Precision Nanosystems NanoAssemblr.
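The pH dependence of the lipid charge described above can be approximated with the Henderson-Hasselbalch relation. This is a textbook simplification that treats the ionizable lipid as a single monoprotic amine; the function name and the pKa of 6.0 (chosen from within the 5.8-6.2 range quoted above) are assumptions for illustration, not values from Shah.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic amine carrying a positive charge at the given pH.

    Henderson-Hasselbalch: [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PKA = 6.0  # hypothetical value inside the 5.8-6.2 range
for label, ph in (("acetate buffer", 4.5), ("blood", 7.35)):
    print(f"{label} (pH {ph}): {fraction_protonated(ASSUMED_PKA, ph):.1%} protonated")
```

Under these assumptions the lipid is almost fully charged at the pH 4.5 mixing step, where it must bind RNA, and largely neutral at blood pH.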

Patents

In August 2012, University of California researchers published an article (“Jinek 2012”) demonstrating that the isolated elements of the CRISPR-Cas9 system could be used in vitro in a non-cellular experimental environment. This led to UC’s US Patent Application No. 13/842,859, which relates to the use of the CRISPR-Cas9 system for the targeted cutting of DNA molecules. The system includes three components: (1) a crRNA; (2) a tracrRNA and (3) the Cas9 protein. The crRNA is an RNA molecule with a variable portion that targets a particular DNA sequence. The nucleotides that make up the variable portion complement the target sequence in the DNA and hybridize with the target DNA. Another portion of the crRNA consists of nucleotides that complement and bind to a portion of the tracrRNA. The Cas9 protein interacts with the crRNA and tracrRNA and cuts both strands of DNA at the target location. (see Regents v. Broad Institute, Fed. Cir. 2018).

Claim 165 of the application is as follows:

“A method of cleaving a nucleic acid comprising

contacting a target DNA molecule having a target sequence with an engineered and/or non-naturally-occurring Type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR associated (Cas) (CRISPR-Cas) system comprising

a) a Cas9 protein; and 

b) a single molecule DNA-targeting RNA comprising (i) a targeter-RNA that hybridizes with the target sequence, and (ii) an activator-RNA that hybridizes with the targeter-RNA to form a double-stranded RNA duplex of a protein-binding segment, wherein the activator-RNA and the targeter-RNA are covalently linked to one another with intervening nucleotides, wherein the single molecule DNA-targeting RNA forms a complex with the Cas9 protein, whereby the single molecule DNA-targeting RNA targets the target sequence, and the Cas9 protein cleaves the target DNA molecule”

In February 2013, Broad Institute researchers published an article describing the use of CRISPR-Cas9 in a human cell line, leading to US Patent No. 8,697,359. Claim 1 of the ‘359 is as follows:

“A method of altering expression of at least one gene product comprising introducing into a eukaryotic cell containing and expressing a DNA molecule having a target sequence and encoding the gene product an engineered, non-naturally occurring clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR associated (Cas) (CRISPR-Cas) system comprising one or more vectors comprising:

a) a first regulatory element operable in a eukaryotic cell operably linked to at least one nucleotide sequence encoding a CRISPR-Cas system guide RNA that hybridizes with the target sequence, and

b) a second regulatory element operable in a eukaryotic cell operably linked to a nucleotide sequence encoding a Type-II Cas9 protein,

wherein components (a) and (b) are located on same or different vectors of the system, whereby the guide RNA targets the target sequence and the Cas9 protein cleaves the DNA molecule, whereby expression of the at least one gene product is altered; and, wherein the Cas9 protein and the guide RNA do not naturally occur together.”

Applications

CRISPR-engineered T cells (Adoptive cell therapy):

The infusion of ex vivo engineered T cells, called adoptive T cell therapy, can increase the natural antitumor immune response of the patient. CRISPR-Cas9 gene editing of T cells from patients with advanced refractory cancer has already been shown effective in clinical trials. Removing the endogenous T cell receptor (TCR) and the immune checkpoint molecule programmed cell death protein 1 (PD-1), for example, has been shown to improve the function and persistence of engineered T cells. The two genes encoding TCRα and TCRβ were deleted in T cells to reduce TCR mispairing and to enhance the expression of a synthetic cancer-specific TCR transgene (NY-ESO-1). The T cell product was manufactured by electroporation of ribonucleoprotein complexes comprising recombinant Cas9 loaded with equimolar mixtures of single guide RNAs (sgRNAs) for TRAC, TRBC and PDCD1, followed by lentiviral transduction of the transgenic TCR. (Stadtmauer, Science 367, 1001 (2020)).

Citrus Industry Examples:

The most devastating disease affecting the global citrus industry is Huanglongbing (HLB), caused by the pathogen Candidatus Liberibacter asiaticus. HLB is primarily spread by the insect vector Diaphorina citri (Asian citrus psyllid). HLB reached the Western Hemisphere by 2004, and by 2013 every grove in Florida was considered infected. To counteract the rapid spread of HLB by D. citri, traditional vector control strategies such as insecticide sprays, the release of natural predators and mass introductions of natural parasitoids are used. However, these methods alone have not managed to contain the spread of the disease. To further expand the available tools for D. citri control through generating specific modifications of the D. citri genome, Akbari developed protocols for CRISPR-Cas9 based genetic modification. However, genome editing in D. citri has been challenging due to the general fragility and size of D. citri eggs. Akbari discloses methods for collecting and preparing eggs to introduce the Cas9 ribonucleoprotein (RNP) into early embryos, and alternative methods of injecting RNP into the hemocoel of adult females for ovarian transduction. To demonstrate genomic DNA targeting, two genes conserved in several insect species and known to produce visible phenotypic changes in eye color when disrupted, w and kh, were chosen. Mutants for either gene display a reduction of pigmentation in their eyes. The genomic sequences for w and kh were identified by comparing the known protein sequences of Drosophila melanogaster w and kh against the latest version of the D. citri genome using tBlastn. Potential sgRNA target sites marked with a short protospacer adjacent motif (PAM, 5′-NGG-3′) across exons 1 to 6 of the w gene and on exons 7 and 8 of the kh gene were identified. Primers across both genes were then designed.
The final list of sgRNAs contained only those with high targeting efficiency and cleavage rates, resulting in three sgRNAs targeting sites on exons 2, 3 and 6 of the w gene and two sgRNAs targeting two sites on exon 7 of the kh gene. (Akbari, GEN Biotechnology, 2(4), 2023).

Transthyretin amyloidosis: also called ATTR amyloidosis, is a life-threatening disease characterized by progressive accumulation of misfolded transthyretin (TTR) protein in tissues, predominantly the nerves and heart. NTLA-2001 is an in vivo gene-editing therapeutic agent that is designed to treat ATTR amyloidosis by reducing the concentration of TTR in serum. It is based on the clustered regularly interspaced short palindromic repeats and associated Cas9 endonuclease (CRISPR-Cas9) system and comprises a lipid nanoparticle encapsulating messenger RNA for Cas9 protein and a single guide RNA targeting TTR. (Gillmore, “CRISPR-Cas9 In Vivo Gene Editing for Transthyretin Amyloidosis” N Engl J Med, 2021)

Transfusion-dependent β-thalassemia (TDT) and sickle cell disease (SCD): are severe monogenic diseases with potentially life-threatening manifestations. BCL11A is a transcription factor that represses γ-globin expression and fetal hemoglobin in erythroid cells. Frangoul (“CRISPR-Cas9 gene editing for sickle cell disease and beta-Thalassemia” N Engl J Med 2021) performed electroporation of CD34+ hematopoietic stem and progenitor cells obtained from healthy donors with CRISPR-Cas9 targeting the BCL11A erythroid-specific enhancer. Approximately 80% of the alleles at this locus were modified, with no evidence of off-target editing. After undergoing myeloablation, two patients, one with TDT and the other with SCD, received autologous CD34+ cells edited with CRISPR-Cas9 targeting the same BCL11A enhancer. More than a year later, both patients had high levels of allelic editing in bone marrow and blood, increases in fetal hemoglobin that were distributed pancellularly, transfusion independence, and (in the patient with SCD) elimination of vaso-occlusive episodes. (Funded by CRISPR Therapeutics and Vertex)

Companies: Beam Therapeutics, Caribou Biosciences, CRISPR Therapeutics, Editas Medicine, Intellia Therapeutics, Poseida, Prime Medicine, Sangamo Therapeutics, Wave Life Sciences, Verve Therapeutics

Introduction:

CRISPR Cas systems are a main protective mechanism of bacteria against phages, which are prevalent in nature. Although the numbers vary across ecosystems, it is estimated that the ratio of bacterial cells to phage particles is about 1:10 in many natural environments. Thus, effective protection against phage infection is crucial for bacterial survival. CRISPR Cas systems are considered the adaptive immune system of bacteria, since they prevent recurring phage infection by specific degradation of the respective phage genome. Upon first infection with a virus, its genome can be degraded, and pieces of the genome can then be integrated into the bacterial chromosome in the form of spacers between the palindromic repeats of the CRISPR arrays. The mature CRISPR RNAs (crRNAs; processing products consisting of one spacer and one repeat sequence) are required to bind to a specific recognition sequence, the protospacer in the phage genome, leading to activation of the nucleolytic activity of the Cas-crRNA complex. (Kretz, “Function of the RNA-targeting class 2 type VI CRISPR Cas system of Rhodobacter capsulatus” Frontiers in Microbiology, 2024).

Bacteria have evolved RNA-mediated adaptive defense systems called clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) that protect organisms from invading viruses and plasmids. CRISPR/Cas systems are composed of cas genes organized in operon(s) and CRISPR array(s) consisting of genome-targeting sequences (called spacers) interspersed with identical repeats. Bacteria harboring CRISPR loci respond to viral challenge by integrating short fragments of foreign sequence (protospacers) into the host chromosome at the proximal end of the CRISPR array. In the expression and interference phases, transcription of the repeat-spacer element into precursor CRISPR RNA (pre-crRNA) molecules followed by enzymatic cleavage yields the short crRNAs that can pair with complementary protospacer sequences of the invading viral targets. Target recognition by crRNAs directs the silencing of the foreign sequences by means of Cas proteins that function in complex with the crRNAs. One important type of CRISPR/Cas system is the type II system, where a trans-activating crRNA (tracrRNA) complementary to the repeat sequences in pre-crRNA triggers processing by the double-stranded (ds) RNA-specific ribonuclease RNase III in the presence of the Cas9 protein. Cas9 proteins constitute a family of enzymes that require a base-paired structure formed between the activating tracrRNA and the targeting crRNA to cleave target dsDNA. Site-specific cleavage occurs at locations determined by both base-pairing complementarity between the crRNA and the target protospacer DNA and a short motif (the protospacer adjacent motif, PAM) next to the complementary region. Cas9 is thus a DNA endonuclease guided by two RNAs. (Jinek, “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity” Science, 337, 2012).

About 60% of bacteria and 90% of archaea possess a CRISPR (clustered regularly interspaced short palindromic repeats)/CRISPR-associated (Cas) system to confer resistance to foreign DNA elements. Doudna (US 10,266,850).

In the adaptation phase, immunological memory is created by inserting the invading nucleic acid fragment as a new spacer into the CRISPR array. In the expression phase, the CRISPR array, containing variable spacer sequences and interspaced by conserved repeat sequences, is transcribed and processed to yield short CRISPR RNAs (crRNAs). These crRNAs specific to the foreign nucleic acid sequence form an effector complex with CRISPR-associated (Cas) proteins that searches the cell for foreign nucleic acids. Many CRISPR-Cas nucleases require a protospacer adjacent motif (PAM), a short sequence adjacent to the spacer target sequence within the target nucleic acid. Upon Cas-crRNA complex target recognition through base pairing, the complex is activated, and the foreign nucleic acid is cleaved. (Balderston, CRISPR Journal, 4(3), 2021).
 
Critical “spacer” sequences inserted between the repeat elements of natural CRISPR-Cas loci in bacteria and archaea are derived from the genomes of previous invaders, usually viruses and plasmids. The CRISPR spacers thus constitute a genomically recorded memory of past cellular intrusions. CRISPR RNAs (crRNAs) transcribed from these repeat-spacer sequences guide Cas proteins to their targets, and the Cas proteins then initiate target destruction. There are numerous, mechanistically diverse CRISPR-Cas systems, but most of them are RNA-guided DNA-cleaving pathways, and those that use a particular Cas protein (Cas9) have revolutionized biomedical research. (Barrangou, Human Gene Therapy, 26(7), 2015).
 
CRISPR-associated genes (cas) form the CRISPR/Cas immune system, which provides adaptive immunity against phages and invasive genetic elements. The immunization process is based on the incorporation of short DNA sequences from virulent phages into the CRISPR locus. Subsequently, CRISPR transcripts are processed into small interfering RNAs that guide a multifunctional protein complex to recognize and cleave matching foreign DNA. The sequence on the viral genome that corresponds to a spacer is termed the “proto-spacer”. CRISPR loci consist of noncontiguous direct repeat sequences that are separated by unique sequences called spacers. These spacer sequences typically derive from foreign genetic elements such as viruses and plasmids. CRISPR repeats and spacers usually vary between 21-48 and 21-72 bp, respectively. The number of spacers within a particular locus can vary widely and reach several hundred. Although repeat sequences can vary across loci, GTTTg/c and GAAAC sequences are common at the 5′ and 3′ ends of the repeat, respectively. (Barrangou, “CRISPR: New horizons in phage resistance and strain identification”. Annu. Rev. Food Sci. Technol. 2012, 3: 143-62).
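The repeat-spacer layout described above can be illustrated with a small sketch. It assumes the repeat sequence is already known and is identical at every position; real arrays often carry degenerate repeats. The repeat, the spacers and the flanking bases below are all invented for illustration and are not taken from any real locus.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`."""
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] is whatever precedes the first repeat and parts[-1] is whatever
    # follows the last repeat; only the pieces in between are spacers.
    return [p for p in parts[1:-1] if p]


# Hypothetical toy array: three copies of an invented repeat and two invented spacers.
REPEAT = "GTTTGCCATAGCGGATCCTGAAAC"            # 24 nt
SPACER_1 = "ACGTTGCAAGGCTTAACCGGTATCGATTGC"   # 30 nt
SPACER_2 = "TTGACCATGGCATCGATAGGCTAACGTTAG"   # 30 nt
toy_array = "ATATAT" + REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT + "GCGC"

spacers = extract_spacers(toy_array, REPEAT)
for number, spacer in enumerate(spacers, start=1):
    print(f"spacer {number}: {spacer} ({len(spacer)} nt)")
```

Splitting on an exact repeat keeps the sketch short; a tool meant for real genomes would need approximate matching to cope with repeat variation.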
 
Classification of CRISPR Systems: 
 
CRISPR-Cas systems have been divided largely into two classes: class 1 systems, including types I, III and IV, that use multi-protein complexes to destroy foreign nucleic acids, and class 2 systems, including types II, V and VI, that use single proteins. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)
 
Enzymes within the CRISPR systems are diverse in their mechanisms. Although all known CRISPR-Cas systems share a conserved mechanism for the adaptation phase, they differ in their interference mechanisms. More specifically, CRISPR systems vary by the effector modules used in the interference phase, which can be either single proteins or protein complexes bound to a crRNA that orchestrates the cleavage of the target nucleic acid molecule. Recent classifications divide CRISPR enzymes into two classes. Class 1 comprises types I, III, and IV, which all have multi-Cas protein effector modules that associate with the crRNA molecule and bind to the target nucleic acid for subsequent cleavage. Class 2 includes types II, V, and VI, where single multidomain proteins complex with the corresponding crRNA to mediate the interaction with their specific nucleic acid target. (Balderston, CRISPR Journal, 4(3), 2021).
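As a compact restatement of the two-class scheme, the lookup below maps each system type to its class and effector architecture. It is only a sketch of the classification as summarized in these notes; the function and dictionary names are hypothetical, and subtypes are not modeled.

```python
# Type -> class, following the class 1 / class 2 split summarized above.
CRISPR_CLASS_BY_TYPE = {
    "I": 1, "III": 1, "IV": 1,   # class 1: multi-protein effector complexes
    "II": 2, "V": 2, "VI": 2,    # class 2: single-protein effectors
}

EFFECTOR_ARCHITECTURE = {
    1: "multi-protein effector complex",
    2: "single effector protein",
}


def describe_type(system_type: str) -> str:
    """Return a one-line description for a CRISPR-Cas type given as a Roman numeral."""
    key = system_type.strip().upper()
    if key not in CRISPR_CLASS_BY_TYPE:
        raise ValueError(f"unknown CRISPR-Cas type: {system_type!r}")
    cls = CRISPR_CLASS_BY_TYPE[key]
    return f"type {key}: class {cls} ({EFFECTOR_ARCHITECTURE[cls]})"


for t in ["I", "II", "III", "IV", "V", "VI"]:
    print(describe_type(t))
```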

Class 1 CRISPR-Cas systems are the most common in bacteria and archaea. (Shmakov, Nat Rev Microbiol. 2017, 15(3) 169-182).

Although class 1 CRISPR systems are the most abundant in bacteria and archaea, their gene editing applications are limited by the fact that they have multiple-subunit effectors. Of the class 1 systems, type I is the most widely used for gene editing, especially for creating long deletions. During type I mediated gene editing, the CRISPR-associated complex for antiviral defense (Cascade), composed of multiple subunit effectors and a crRNA, binds target DNA and forms an R-loop structure. Cas3 is then specifically recruited and cleaves the target DNA. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)

Class 2 CRISPR-Cas systems: The remaining 10% of CRISPR-cas loci (those not belonging to class 1) belong to class 2 CRISPR-Cas systems, which use a type II, V or VI effector protein; these systems are found almost exclusively in bacteria. Class 2 effectors include Cas9, Cpf1, C2c1, C2c2 and C2c3. Class 2 subtypes fall into two subclasses: those that cleave the non-target strand of the target dsDNA using a RuvC-like nuclease and those that attack RNA targets using a two-HEPN-domain RNase (Shmakov, Nat Rev Microbiol. 2017, 15(3) 169-182).

–Cas9 nucleases: see also outline

Cas9 nucleases, which are components of type II CRISPR-Cas systems, are RNA-guided DNA endonucleases that induce double-strand breaks (DSBs) at target sites. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)

The type II CRISPR system from Streptococcus pyogenes involves only a single gene encoding the Cas9 protein and two RNAs, a mature CRISPR RNA (crRNA) and a partially complementary trans-acting RNA (tracrRNA), which are necessary and sufficient for RNA-guided silencing of foreign DNA. Doudna (US 10,266,850).

The Streptococcus pyogenes SF370 type II CRISPR locus consists of four genes, including the Cas9 nuclease, as well as two noncoding CRISPR RNAs (crRNAs): trans-activating crRNA (tracrRNA) and a precursor crRNA (pre-crRNA) array containing nuclease guide sequences (spacers) interspaced by identical direct repeats (DRs). Cong (Science, 339, 15, 2013)

The type II CRISPR enzyme Cas9, adopted from the immune system of Streptococcus pyogenes (SpCas9), is the most widely used CRISPR enzyme for genome engineering. In the interference phase, Cas9 associates with a crRNA and a short trans-activating CRISPR RNA (tracrRNA) sequence that forms a partial duplex with the crRNA. This complex probes double-stranded DNA (dsDNA) when it encounters a three-nucleotide 5′-NGG-3′ PAM, and if there is crRNA/DNA sequence complementarity, Cas9 is activated and cleaves the dsDNA between three and seven nucleotides upstream from the PAM. (Balderston, CRISPR Journal, 4(3), 2021)
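The PAM requirement lends itself to a short worked example. The sketch below scans one DNA strand for 5′-NGG-3′ motifs and reports the stretch immediately upstream of each as a candidate protospacer. The 20-nt protospacer length is an assumption added here (it is not stated in the text above), the function name is hypothetical, and the opposite strand, off-target scoring and cut-site prediction are deliberately left out.

```python
import re


def find_ngg_targets(dna: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """List (start, protospacer, pam) for every NGG PAM on the given strand.

    `start` is the 0-based index of the first protospacer base. `spacer_len`
    defaults to 20 nt, which is an assumption made for this illustration.
    """
    dna = dna.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs (e.g. in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        start = pam_start - spacer_len
        if start < 0:
            continue  # not enough sequence upstream for a full protospacer
        hits.append((start, dna[start:pam_start], match.group(1)))
    return hits


# Invented example: a 20-nt stretch followed by a TGG PAM.
protospacer = "ACGTACGTACGTACGTACGT"
example = protospacer + "TGG" + "A"
for start, proto, pam in find_ngg_targets(example):
    print(f"protospacer at {start}: {proto}  PAM: {pam}")
```

To cover both strands, the same function could be run on the reverse complement of the input.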

–Cas12 nucleases:

Cas12 (type V) nucleases possess a single RuvC-like domain that cleaves both target and non-target strands and generates staggered ends downstream of PAM sites. Cas12a (formerly known as Cpf1) was the first Cas12 nuclease to be used as a gene editing tool. It requires only a crRNA and can self-process pre-crRNA into mature crRNA, which is advantageous for multiplex gene editing. Cas12b, which requires both a crRNA and a tracrRNA, has achieved gene editing in human cells and plants. Cas12d (formerly CasY), Cas12h and Cas12i have RNA-guided DNA interference activity in E. coli. Cas12g is an RNA-guided ribonuclease with collateral RNase and single-strand DNase activities. Cas12e (formerly CasX) and Cas12 have been adopted as gene editing tools in eukaryotic cells, and Cas12f (formerly Cas14) nucleases have been shown to achieve robust gene modification and regulation in mammalian cells. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)

–Cas13 nucleases:

Class 2 type VI CRISPR Cas systems target RNA, and the single effector nuclease belongs to the Cas13 protein family (divided into the four major subgroups Cas13a-d). Compared to other CRISPR Cas systems (e.g., the dsDNA-targeting Cas9 systems), class 2 type VI systems are relatively rare in bacteria. (Kretz, “Function of the RNA-targeting class 2 type VI CRISPR Cas system of Rhodobacter capsulatus” Frontiers in Microbiology, 2024)

Cas13 nucleases, which belong to type VI CRISPR Cas systems and contain two HEPN domains, are RNA-guided ribonucleases. They can process their pre-crRNA into mature crRNA, require only this crRNA to cleave target RNA, and have collateral activity. (Liu, “The CRISPR-Cas toolbox and gene editing technologies” Molecular Cell 82, 2022)

DNA-Targeting CRISPR

CRISPR-Cas systems provide adaptive immunity in archaea and bacteria. In brief, the CRISPR-Cas response consists of three stages. In stage one, the adaptation phase, the Cas1-Cas2 protein complex excises a segment of the target DNA (known as the protospacer) and inserts it between the repeats at the 5′ end of a CRISPR array, yielding a new spacer. In stage two, the expression and processing stage, a CRISPR array, together with the spacers, is transcribed into a long transcript known as the pre-CRISPR RNA (pre-crRNA), which is processed by a distinct complex of Cas proteins into mature small CRISPR RNAs (crRNAs). In stage three, the interference stage, a complex of Cas proteins uses the crRNA as a guide to cleave the target DNA or RNA. (Shmakov, Nat Rev Microbiol. 2017, 15(3) 169-182).

DNA-targeting CRISPR effectors include Cas9 (see outline), Cas12a (Cpf1), Cas12b (C2c1) and Cas12e (CasX).

CRISPR/Cas9:  See outline

RNA-Targeting CRISPR

Although the primary target of the CRISPR/Cas system is DNA in most studied systems, it has been shown that some systems can target RNA. Accordingly, there is potential to leverage CRISPR loci targeting RNA to regulate or silence transcript levels within the cell. (Barrangou, “CRISPR: New horizons in phage resistance and strain identification”. Annu. Rev. Food Sci. Technol. 2012, 3: 143-62)

RNA-targeting CRISPR effectors include Cas13a (C2c2), Cas13b and Cas13d (CasRx).

RNA-guided RNA-targeting CRISPR-Cas effector Cas13a (previously known as C2c2):

Although some Cas enzymes target DNA, single-effector RNA-guided ribonucleases (RNases), such as Cas13a, can be reprogrammed with CRISPR RNAs to provide a platform for specific RNA sensing. On recognition of the RNA target, activated Cas13a engages in “collateral” cleavage of nearby non-targeted RNAs. This collateral effect, combined with isothermal amplification, has been used to establish a CRISPR-based diagnostic providing rapid DNA or RNA detection (Gootenberg, “Nucleic acid detection with CRISPR-Cas13a/C2c2” Science 356, 438-442 (2017)).

Cas13a can be engineered for mammalian cell RNA knockdown and binding. (Abudayyeh, Nature, 550, 2017).  

Applications of CRISPR systems:

Most applications of CRISPR systems have focused on the programmable DNA-targeting activity of Cas9. The cleavage activity of Cas9 can be harnessed for genome editing, including gene knockout and precise editing through homology-directed repair. (Shmakov, Nat Rev Microbiol. 2017, 15(3) 169-182).

Pathogen detection

Ackerman (Nature, 582, 2020) discloses combinatorial arrayed reactions for multiplexed evaluation of nucleic acids (CARMEN) for scalable, multiplexed pathogen detection. In CARMEN, nanolitre droplets containing CRISPR-based nucleic acid detection reagents self-organize in a microwell array to pair with droplets of amplified samples, testing each sample against each CRISPR RNA (crRNA) in replicate. The combination of CARMEN and Cas13 detection (CARMEN-Cas13) enables robust testing of more than 4,500 crRNA-target pairs on a single array and simultaneously differentiates 169 human-associated viruses with at least 10 published genome sequences.

CRISPR (see outline)

Restriction Enzymes

Restriction enzymes are endonucleases from eubacteria and archaea that recognize a specific DNA sequence, called the restriction site. Usually a restriction site is a palindromic sequence of about 4-6 nucleotides. Most restriction endonucleases cut the two DNA strands at staggered positions, leaving complementary single-stranded ends. These ends can reconnect through hybridization and are referred to as “sticky ends”. Once paired, the phosphodiester bonds of the fragments can be joined by DNA ligase.
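A short sketch can make "palindromic" concrete: a site is palindromic in this sense when it equals its own reverse complement. The example uses EcoRI (site GAATTC, cutting after the first G on each strand) as an illustration chosen here, not one named in the text. The function names are hypothetical, and only the top strand is modeled, so the sticky-end overhangs themselves are not represented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True when the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest(dna: str, site: str, cut_offset: int) -> list[str]:
    """Cut the top strand `cut_offset` bases into every occurrence of `site`."""
    dna, site = dna.upper(), site.upper()
    fragments, previous_cut = [], 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[previous_cut:])
    return fragments


ECORI_SITE = "GAATTC"   # EcoRI cuts between G and AATTC
print(is_palindromic(ECORI_SITE))
print(digest("AAGAATTCTTGAATTCGG", ECORI_SITE, cut_offset=1))
```

Each internal fragment here begins with AATTC, which corresponds to the single-stranded 5′ overhang that EcoRI leaves on real double-stranded DNA.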

Alpha-Amylases: constitute a class of enzymes synthesized by a variety of organisms, from bacteria to fungi to humans, that break down large molecules known as polysaccharides. Polysaccharides, such as starch and glycogen, are defined as long-chain polymers made up of repeating simple sugar molecules like glucose, among others. Alpha-amylases sever the bonds between adjacent sugars in a polysaccharide to yield single or short-chain simple sugars that can provide energy or be used as building blocks for other cellular processes. On average, alpha-amylase enzymes comprise about 500 amino acids.

Beyond a widespread role in natural systems, alpha-amylases also have important commercial applications in detergent formulations, sugar refining, and ethanol production. Of particular note, many alpha-amylases derived from bacteria of the genus Bacillus exhibit exceptional enzymatic activity, which has made those bacterial enzymes attractive for commercial use. One such product is a preparation of alpha-amylase derived from B. licheniformis (“BLA”) that is the subject of a famous court decision concerning written description (see Ypatent Blog and “Novozymes” case).

The glycosylation pattern of immunoglobulins (i.e., the saccharide composition and multitude of attached glycostructures) has a strong influence on their biological properties. Glycosylation of IgG has been shown to be essential for binding to all FcyRs by maintaining an open conformation of the two heavy chains. This absolute requirement of IgG glycosylation for FcyR binding accounts for the inability of deglycosylated IgG antibodies to mediate in vivo triggered inflammatory responses, such as ADCC, phagocytosis and the release of inflammatory mediators.

The nature and importance of the conserved Asn297-linked carbohydrate in influencing IgG effector functions has long been recognized. Variations in composition of the carbohydrate have been shown to affect the affinity of IgG for the three classes of FcyR, FcyRI (CD64), FcyRII (CD32), and FcyRIII (CD16), that link IgG antibody-mediated immune responses with cellular effector functions. Carbohydrate composition also influences the activity of IgG in the classical pathway of complement activation, which is initiated by IgG1 binding to C1q, and its binding to mannose-binding protein, which structurally resembles C1q. (Shields, J. Biological Chemistry, 277(30), 2002)

Effects of Deglycosylated IgGs

It is well described that deglycosylated IgGs are almost completely devoid of all Fc-mediated immune effector functions as a result of drastically reduced binding to FcyRs or to proteins of the complement system (Ferrara, PNAS, 2011, 108(31) 12669-12674).

Effects Sialic acid & galactose:

A low level of galactosylation positively affects complement activation. (Friemoser-Grundshober, (US14/352411))

Nimmerjahn (WO2008/057634) discloses that enrichment of alpha 2,6 linkages between sialic acid and galactose improves the anti-inflammatory properties of IVIG Fc fragments and that removal of these linkages attenuates the anti-inflammatory properties of IVIG Fc fragments.

Effects Fucosylation:

Effects on ADCC:  (see also engineered antibodies)

One IgG molecule contains two N-linked oligosaccharide sites in the Fc region. The general structure of N-linked oligosaccharide on IgG is complex-type, characterized by a mannosyl-chitobiose core (Man3GlcNAc2-Asn) with or without bisecting GlcNAc/L-fucose (Fuc) and other chain variants including the presence or absence of Gal and sialic acid. In addition, oligosaccharides may contain zero (G0), one (G1) or two (G2) Gal. Recent studies have shown that engineering the oligosaccharides of IgGs may yield optimized ADCC. (Shinkawa, J. Biological Chemistry, 278(5), 2003)

ADCC is controlled almost solely by the absence of fucose on IgG1; galactose and bisecting GlcNAc contribute little or nothing to ADCC (Yamane-Ohnuki, Biotechnology & Bioengineering, 87(5), 2004).

The lack of core fucose results in higher binding affinity to FcyRIIIa and thereby enhances ADCC. (Friemoser-Grundshober, (US14/352411))

De-fucosylated but glycosylated Herceptin is at least 50-fold more active in Fcy receptor IIIa mediated ADCC than Herceptin bearing alpha-1,6-linked fucose residues. Wong (US2011/0263828) teaches an antibody Fc region that is specifically glycosylated with oligosaccharides that increase the efficacy and stability of the Fc region. In one embodiment, the terminal sugar units are sialic acids linked to galactose.

Lec13 cells, a variant Chinese hamster ovary cell line, can be used to produce human IgG1 that is deficient in fucose attached to the Asn297-linked carbohydrate. Lack of fucose on the IgG1 has no effect on binding to human FcyRI, C1q, or the neonatal Fc receptor. In contrast, binding of the fucose-deficient IgG1 to human FcyRIIIA was improved up to 50-fold. (Shields, J. Biological Chemistry, 277(30), 2002)

Removal of core fucose selectively and significantly increases binding affinity to FcyRIII and leads to enhanced cellular immune effector functions, such as ADCC. (Ferrara, PNAS, 2011, 108(31) 12669-12674)

Sialic acid: 

The removal (or the absence or reduced levels) of sialic acid from the Fc oligosaccharides enhances the avidity of recombinantly produced antibodies for their target molecule (Cai, WO2007/005786).

The carbohydrate structures of all naturally produced antibodies at conserved positions in the heavy chain constant regions vary with isotype. Each isotype possesses a distinct array of N-linked oligosaccharide structures, which variably affect protein assembly, secretion or functional activity. 

See also Methods used to separate antibody glycovariants. See also Mass Spectrometry (MALDI TOF MS).

Detection and characterization of product-associated variants during the production of polypeptides (e.g., antibodies) is important because profiling of impurities is a regulatory requirement of the FDA. Product-associated variants include not only truncated or elongated peptides but also peptides having glycosylation different from the desired glycosylation. Product-associated variants may exhibit alterations in one or more of molecular mass (e.g., detected by size exclusion chromatography), isoelectric point (e.g., detected by isoelectric focusing), electrophoretic mobility (e.g., detected by gel electrophoresis), phosphorylation state (e.g., detected by mass spectrometry), charge-to-mass ratio (e.g., detected by mass spectrometry), mass or identity of proteolytic fragments (e.g., detected by mass spectrometry or gel electrophoresis), hydrophobicity (e.g., detected by HPLC), charge (e.g., detected by ion exchange chromatography), affinity (e.g., in the case of an antibody, detected by binding to protein A, protein G and/or an epitope to which the desired antibody binds) and glycosylation state (e.g., detected by lectin binding affinity). (Allison, US 14/215370).

Chromatography

Affinity chromatography

–Lectin Affinity Chromatography:

Allison (US 14/215370) discloses culturing yeast cells to express a desired recombinant polypeptide, periodically obtaining one or more samples of the fermentation medium, detecting the amount and/or type of glycosylated impurities in the sample using a lectin that binds to the glycosylated impurities (e.g., ConA, LCH, GNA, DC-SIGN), modifying one or more of the operating parameters or conditions of the fermentation based on the amount of detected glycosylated impurities, and finally pooling different samples, eluates or fractions containing the desired recombinant antibody polypeptide from the same or different fermentation processes. In one embodiment, the lectin is conjugated to a probe such as biotin, alkaline phosphatase or horseradish peroxidase and then immobilized to a support such as avidin. Standard protein-protein interaction monitoring processes (e.g., surface plasmon resonance, ELISA, dynamic light scattering) can be used to analyze the interaction between lectin and glycosylated impurities in samples from various steps of the purification process.

HPLC: 

Normal phase HPLC is an established technique for separating mixtures of oligosaccharides. The earliest methods used columns with diethylaminoethyl functional groups to separate neutral and acidic oligosaccharides. More recently, columns with imide and amide functional groups have been used. Oligosaccharides applied to a silica matrix with amide functional groups have been successively and predictably eluted with acetonitrile-water gradients buffered with volatile salts. These solvent systems exploit the subtle differences in hydrophilicity between individual sugars, and thus in their affinity for the column matrix, to produce high resolution. (Guile, Analytical Biochemistry 210-220, 1996).

Mass Spectrometry

Electrospray-ionization mass spectrometry (ESI-MS): is a well established tool for biotherapeutic analysis. It draws intact protein or peptide ions into the vacuum of a mass spectrometer, where the ion mass is measured. ESI-ion mobility spectrometry (ESI-IMS) introduces ions into a low-pressure gas, where the effects of aerodynamic drag reveal their shape. One promising application for ESI-IMS is to generate biophysical “fingerprints” for comparing biosimilars with innovator drugs.

MALDI TOF MS

Structural characterization of individual oligosaccharides has proven to be much more challenging than for other biopolymers due to their branched and isomeric nature. Techniques based on mass spectrometry (MS) have been used extensively and successfully for the characterization of glycans. Matrix-assisted laser desorption/ionization time-of-flight MS (MALDI TOF MS) has been the tool of choice because it generally produces singly charged ions, yielding simple spectra and making interpretation straightforward. (Qian, Anal. Biochem. 364 (2007) 8-18).

Definitions

Glycan: refers to the carbohydrate portion of a glycoconjugate, such as a glycopeptide, glycoprotein, glycolipid or proteoglycan. Regnier (US8,568,993) 

N-glycans

Peptides expressed in eukaryotic cells are typically N-glycosylated on asparagine residues at sites in the peptide primary structure containing the sequence asparagine-X-serine/threonine, where X can be any amino acid except proline and aspartic acid. The carbohydrate portion of such peptides is known as an N-linked glycan.
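The consensus sequence above translates directly into a pattern search. The sketch below looks for asparagine, then any residue other than proline or aspartic acid (the exclusion as stated in the text), then serine or threonine, in a one-letter protein sequence. The function name and the example sequence are invented for illustration, and a match only marks a potential site; whether it is actually glycosylated cannot be read from sequence alone.

```python
import re

# N, then any residue except P or D (as stated in the text), then S or T.
# The lookahead lets overlapping sites such as "NNTS" be reported separately.
SEQUON = re.compile(r"(?=(N[^PD][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, matched tripeptide) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Invented test sequence: NGS matches, NPT and NDT are excluded, NNT and NTS overlap.
example = "MNGSANPTQNDTLNNTS"
for position, tripeptide in find_sequons(example):
    print(f"Asn{position}: {tripeptide}")
```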

N-glycans attached to glycoproteins differ with respect to the number of branches (antennae) comprising peripheral sugars (e.g., GlcNAc, galactose, fucose, and sialic acid) that are added to a GlcNAc2Man3 core structure. (Hiatt, US2013/0149300)

–biantennary N-glycans: refers to a complex oligosaccharide where the core comprises two branch terminal N-acetylglucosamine (GlcNAc), 3 mannose (Man) and 2 GlcNAc monosaccharide residues that are attached to the asparagine residue of the glycoprotein.

O-linked glycans:

Peptides may also be modified by addition of O-linked glycans, also called “mucin-type glycans” because of their prevalence on mucinous glycopeptides. Unlike N-glycans, which are linked to asparagine residues and are formed by en bloc transfer of oligosaccharide from lipid-bound intermediates, O-glycans are linked primarily to serine and threonine residues and are formed by the stepwise addition of sugars from nucleotide sugars. (Defrees, US2008/0206808).

Sialic Acids

Sialic acid: is a generic term for the N- or O-substituted derivatives of neuraminic acid, a monosaccharide with a 9 carbon backbone. It is also the name for the most common member of this group, N-acetylneuraminic acid (Neu5Ac). Sialic acids occur at the end of sugar chains connected to the surfaces of cells or soluble proteins. 

Sialic acids on glycans are known to be important in prolonging the serum half-life of glycoproteins other than antibodies. So far the role of sialic acid on monoclonal antibodies is not well understood (Cai, WO2007/005786).

Sialic acids (Sias) are nine-carbon backbone monosaccharides with a carboxylic function in the C-1 position and are usually present at the outermost (non-reducing) end of glycan chains in the deuterostome lineage of animals. More than 20 sialyltransferases have been described which attach different Sias onto various acceptor structures in highly specific linkages. The predominant sialic acids found on mammalian cell surfaces are Neu5Ac and Neu5Gc. Being positioned at the outermost end of glycan chains, Sias represent the receptors most frequently targeted by pathogens that use Sia-binding proteins to access host cells, and specificity towards either Neu5Ac or Neu5Gc may be observed. They are also known to serve as ligands for intrinsic sialic acid binding proteins such as Siglecs. Some Siglecs even discriminate between Neu5Ac and Neu5Gc, even though the two Sias differ only by the presence or absence of a single oxygen atom. Whereas most mammals express Neu5Gc-containing glycans on their cell surfaces, glycans on human cells primarily express the precursor molecule Neu5Ac. Indeed, humans generate immune responses against molecules carrying Neu5Gc (e.g., the “serum sickness” reaction to equine anti-thymocyte globulin therapy). The loss of Neu5Gc occurred about 3 million years ago and represented the first known genetic difference between humans and chimpanzees that could be directly linked to an altered phenotype. Despite the human-specific loss of Neu5Gc de novo biosynthesis, Neu5Gc has been detected in several malignant human tumors. (Ghaderi, Biotechnology & Genetic Engineering Reviews, 28, 147-176 (2012)).

Protein sialylation is an enzymatic process and is the terminal reaction of glycosylation that produces matured sialylated oligosaccharides on glycoproteins. In the rough endoplasmic reticulum (ER), a high-mannose core is added to newly synthesized protein. The protein is then transported to the Golgi apparatus (GA). There are at least 18 different intracellular, Golgi membrane-bound glycosyltransferases which catalyze the reactions for growing oligosaccharide chains by using nucleotide sugar precursors as substrates. For instance, sialyltransferase ST3GAL4 (ST3 beta-galactoside alpha-2,3 sialyltransferase 4) uses CMP-sialic acid as a substrate and adds alpha-2,3-linked sialic acid to beta-1,4 galactose. (Xu, Mol Biotechnol (2010) 45: 248-256)

N-acetyl neuraminic acid (Neu5Ac, NeuAc, or NANA): is the most common Sia and serves as a biosynthetic precursor for most other Sias. When the N-acetyl group of NeuAc is hydroxylated, the N-glycolyl form (Neu5Gc) results. That form is prevalent in glycoproteins from rodent and microbial sources. (Cai, WO2007/005786)

N-glycolyl-neuraminic acid (Neu5Gc, NGNA or NeuGc): biosynthesis of Neu5Gc occurs exclusively by hydroxylation of the N-acetyl group of CMP-Neu5Ac to yield CMP-Neu5Gc.

All humans have a unique inactivating homozygous mutation in CMP-Neu5Ac hydroxylase (CMAH), eliminating the enzymatic activity that generates CMP-Neu5Gc. The CMAH mutation was caused by an Alu replacement event about 2.5 million years ago and resulted in the absence of Neu5Gc and a secondary increased level of Neu5Ac on human cell surfaces (Sonnenburg, Glycobiology, 14(4), 339-346, 2004).

Definitions:

Isomer: of a compound is a separate compound in which each molecule contains the same constituent atoms as the first compound, but with those atoms arranged differently. Thus isomers are different compounds having identical chemical formula.

Glucose, fructose and galactose are isomers with the molecular formula C6H12O6. A structural isomer of glucose, such as fructose, has identical chemical groups bonded to different carbon atoms. A stereoisomer of glucose, such as galactose, has identical chemical groups bonded to the same carbon atoms but in different orientations.

Enzymes that act on different sugars can distinguish both the structural isomers and the stereoisomers of this basic 6-carbon skeleton.

–Structural Isomers: If there are differences in the actual structure of the carbon skeleton of, for example, organic molecules, the molecules are called structural isomers. Glucose and fructose, for example, are structural isomers of C6H12O6. Fructose is a structural isomer that differs in the position of the carbonyl carbon (C=O). Your taste buds can differentiate fructose, which tastes much sweeter than glucose, despite the fact that both sugars have identical chemical compositions.

–Stereoisomer: is an isomer in which the same atoms are bonded to the same other atoms, but where the configuration of those atoms in three dimensions differs. For example, a dashed triangle leading from a marked carbon to a H atom indicates that the H lies below the planes of the two five-sided rings of which the carbon atom might be a part. If the H atom lies above the planes of the rings, then the resulting structure is a stereoisomer.

Thus stereoisomers have the same carbon skeleton but differ in how the groups attached to this skeleton are arranged in space. Enzymes in biological systems usually recognize only a single, specific stereoisomer.

—-Enantiomers: are stereoisomers (spatial isomers) wherein the isomeric compounds have the same chemical formula and the same chemical structure, but differ in their orientation in three-dimensional space. Such stereoisomers can exist for all molecules that contain an asymmetric carbon atom. An “asymmetric carbon” is a C atom to which four different substituents are attached, whereby, due to the tetrahedral structure of C bonds in 3 dimensions, the spatial orientation of substituents attached to a C atom varies. When there is only one asymmetric C atom in the molecule and thus only 2 stereoisomers, these isomers are called enantiomers.

Enantiomers are thus mirror images of each other. A molecule that has mirror image versions is called a chiral molecule. The two molecules have the same groups but cannot be superimposed, much like your two hands.

Enantiomers are identified and distinguished by their optical characteristics when a purified solution of the separated isomers is exposed to plane-polarized light. Enantiomers accordingly exhibit different optical activity: the enantiomer that rotates the plane of polarized light in the clockwise direction is the (+)-enantiomer (also called the dextrorotatory or d-isomer); the enantiomer that rotates the plane of polarized light in the counterclockwise direction is the (-)-enantiomer (also called the levorotatory or l-isomer).

Enantiomers may also be designated as the S-enantiomer and the R-enantiomer according to a different criterion: the spatial arrangement of the substituents around the chiral center, ranked by the Cahn-Ingold-Prelog priority rules. “Chiral” describes asymmetric molecules that cannot be superimposed on their mirror images (i.e., like right and left hands).

Although enantiomers have nearly identical physical properties, they often have very different biological activities. This is due to the chirality of biological molecules, such as proteins, and the resulting effect on, for example, enzyme active sites. Whereas an enzyme may recognize and catalyze a reaction with one enantiomer due to physical and chemical complementarity with the enzyme’s active site, the same enzyme may not recognize the other enantiomer, due to noncomplementarity.

In vivo, one enantiomer may be converted to the other through racemization, a process whereby a compound consisting of a single enantiomer is converted to a one-to-one mixture of that enantiomer and its opposite (i.e., the racemate) by the cleavage and reformation of a chemical bond at the chiral center of the molecule. Racemates are mixtures of equal amounts of enantiomers and are denoted as (d,l) or (+/-) pairs for the stereoisomerism of a given chiral center. Since there are 2 enantiomers in a racemate of a compound with a single chiral center, a racemate potentially comprises a very small genus of two species. As the number of chiral centers increases, however, the number of species also increases. The maximum number of stereoisomers is 2^n, where n is the number of chiral centers.
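
As a small illustration of how quickly the count grows with the number of chiral centers, here is a minimal Python sketch. The function name `max_stereoisomers` is made up for this example, and the result is an upper bound: molecules with internal symmetry (meso forms) can have fewer distinct stereoisomers.

```python
def max_stereoisomers(n_chiral_centers: int) -> int:
    """Return the upper bound on stereoisomers: 2 raised to the number of chiral centers."""
    if n_chiral_centers < 0:
        raise ValueError("number of chiral centers cannot be negative")
    return 2 ** n_chiral_centers


# One chiral center gives a single pair of enantiomers; each added center doubles the count.
for n in (1, 2, 3, 4):
    print(f"{n} chiral center(s): up to {max_stereoisomers(n)} stereoisomers")
```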

Diastereomers: are stereoisomers that are not enantiomers.

Definitions:

Carbon is a versatile element because an atom of carbon can share electrons with other atoms in four covalent bonds that can branch off in four directions. Because C can use one or more of its bonds to attach to other carbon atoms, it is possible to construct an endless diversity of C skeletons varying in size and branching pattern. Thus, molecules with multiple C intersections can form very elaborate shapes. The C atoms of organic compounds can also bond with other elements, mostly hydrogen, oxygen and nitrogen.

Carbohydrates: are a loosely defined group of molecules that all contain carbon, hydrogen, and oxygen in the molar ratio 1:2:1. The empirical formula (which lists the number of atoms in the molecules with subscripts) is (CH2O)n, where n is the number of C atoms. Because they contain many carbon-hydrogen (C-H) bonds, which release energy when they are rearranged, carbohydrates are well suited for energy storage. Sugars are among the most important energy storage molecules and they exist in several different forms.

The simplest of the carbohydrates are the monosaccharides. Simple sugars contain as few as 3 carbon atoms, but those that play the central role in energy storage have 6. The empirical formula of 6 carbon sugars is C6H12O6 or (CH2O)6. The most important 6 carbon sugar used for energy storage is glucose.
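
The 1:2:1 ratio can be expanded mechanically for any number of carbons. The sketch below is only an illustration; the helper name `monosaccharide_formula` is an assumption made for this example. It multiplies out (CH2O)n for a simple sugar with n carbon atoms.

```python
def monosaccharide_formula(n_carbons: int) -> str:
    """Expand (CH2O)n into a molecular formula such as C6H12O6."""
    if n_carbons < 3:
        # The text notes that simple sugars contain as few as 3 carbon atoms.
        raise ValueError("simple sugars contain at least 3 carbon atoms")
    # 1 carbon : 2 hydrogens : 1 oxygen for every CH2O unit.
    return f"C{n_carbons}H{2 * n_carbons}O{n_carbons}"


for n in (3, 5, 6):
    print(n, monosaccharide_formula(n))
```

For n = 6 the function returns the C6H12O6 formula shared by glucose, fructose and galactose.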

Hydrocarbons: are molecules that consist only of carbon and hydrogen. Because the oxidation of hydrocarbon compounds results in a net release of energy, hydrocarbons make good fuels. Gasoline, for example, is rich in hydrocarbons, and propane (C3H8) gas, another hydrocarbon, consists of a chain of three C atoms with 8 hydrogen atoms bound to it. C and H atoms have very similar electronegativities.

Disaccharide: Most organisms transport sugars within their bodies. In humans, the glucose that circulates in the blood does so as a simple monosaccharide. In plants and many other organisms, however, glucose is converted into a transport form before it is moved. In such a form, it is less readily metabolized during transport. Transport forms of sugars are commonly made by linking two monosaccharides together to form a disaccharide. Disaccharides serve as effective reservoirs of glucose because the enzymes that normally use glucose in the organism cannot break the bond linking the two monosaccharide subunits. Enzymes that can do so are typically present only in the tissue that uses glucose.

Glucose: is the most important 6 carbon sugar used for energy storage. Glucose has 7 energy-storing C-H bonds. Depending on the orientation of the carbonyl group when the ring is closed, glucose can exist in two different forms: alpha or beta. This is significant when glucose is used as a monomer to form polymers.

Lactose: When glucose is linked to its stereoisomer galactose, the resulting disaccharide is lactose, or milk sugar. Many mammals supply energy to their young in the form of lactose. Adults often have greatly reduced levels of lactase, the enzyme required to cleave lactose into its two monosaccharide components, and thus they cannot metabolize lactose efficiently. This can result in lactose intolerance in humans.

Lipids: have a very high proportion of nonpolar carbon-hydrogen bonds. Long chain lipids cannot fold up like a protein to confine their nonpolar portions away from the surrounding aqueous environment. Instead, many lipid molecules cluster together and expose what polar (hydrophilic) groups they have to the surrounding water while confining the nonpolar (hydrophobic) parts together within the cluster. Many lipids are built from fatty acids and glycerol. Many lipid molecules consist of a glycerol molecule with three fatty acids attached, one to each carbon of the glycerol backbone. Because it contains three fatty acids, a fat molecule is commonly called a triglyceride.

Organisms contain many other kinds of lipids besides fats. Terpenes are long chain lipids that are components of many biologically important pigments such as chlorophyll and the visual pigment retinal. Rubber is also a terpene. Steroids, another class of lipid, are composed of four carbon rings. Most animal cell membranes contain the steroid cholesterol. Other steroids, such as testosterone and estrogen, function as hormones in multicellular animals. Prostaglandins are a group of about 20 lipids that are modified fatty acids, with two nonpolar tails attached to a 5 carbon ring. Prostaglandins act as local chemical messengers in many vertebrate tissues.

Phospholipids: are complex lipid molecules which form the core of all biological membranes. An individual phospholipid can be thought of as a substituted triglyceride with a phosphate replacing one of the fatty acids. Thus the basic structure of a phospholipid includes 1) glycerol, 2) fatty acids and 3) a phosphate group.

Fats: consist of fatty acids attached to glycerol. Fats are excellent energy storage molecules. Most fats contain over 40 carbon atoms. The proportion of C-H bonds in fats is more than twice that of carbohydrates. This means that fats are relatively more reduced than carbohydrates, and will thus release more energy upon oxidation. This makes fats much more efficient molecules for storing chemical energy. Most fats produced by animals are saturated (except some fish oils) whereas most plant fats are unsaturated.

–Fatty acids are long chain hydrocarbons with a carboxyl group (COOH) at one end. If all of the internal carbon atoms in a fatty acid are bonded to two hydrogen atoms, the fatty acid is saturated with the maximum number of H atoms possible. A fatty acid with double bonds between one or more pairs of successive carbon atoms will have fewer hydrogen atoms, and is thus unsaturated. Because there is no rotation around a double bond, each double bond produces a kink in the chain. This makes unsaturated fats (oils) liquid at room temperature, whereas saturated fats (animal fat and butter) are solid at room temperature.

–Glycerol: is a 3 carbon polyalcohol (three -OH groups). 
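
The hydrogen counting in the fatty acid entry above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `fatty_acid_formula` is an assumption, and it covers only unbranched fatty acids with a single carboxyl group. A saturated chain with n carbons has the formula CnH2nO2, and each carbon-carbon double bond removes two hydrogens. The 18-carbon examples are not from the text above.

```python
def fatty_acid_formula(n_carbons: int, n_double_bonds: int = 0) -> str:
    """Molecular formula of an unbranched fatty acid with the given C=C double bonds."""
    if n_carbons < 2:
        raise ValueError("a fatty acid needs at least 2 carbon atoms")
    # Each double bond needs its own pair of chain carbons (the carboxyl carbon is excluded).
    if n_double_bonds < 0 or 2 * n_double_bonds >= n_carbons:
        raise ValueError("impossible number of double bonds for this chain length")
    # Saturated: CnH2nO2. Every C=C double bond removes two hydrogen atoms.
    hydrogens = 2 * n_carbons - 2 * n_double_bonds
    return f"C{n_carbons}H{hydrogens}O2"


print(fatty_acid_formula(18, 0))  # saturated 18-carbon chain (stearic acid)
print(fatty_acid_formula(18, 1))  # one double bond (oleic acid)
print(fatty_acid_formula(18, 2))  # two double bonds (linoleic acid)
```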

Sucrose: When glucose forms a disaccharide with its structural isomer fructose, the resulting disaccharide is sucrose, or table sugar. Sucrose is the form most plants use to transport glucose and is the sugar that most humans and other animals eat. Sugarcane and sugar beets are rich in sucrose.

Non-polar: Because C and H have very similar electronegativities, electrons in C-C and C-H bonds are evenly distributed, with no significant differences in charge over the molecular surface. For this reason, hydrocarbons are nonpolar.

Polar: Because other atoms besides H and C frequently have different electronegativities, molecules containing them exhibit regions of partial positive or negative charge. They are polar.
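
The distinction between nonpolar and polar bonds can be made concrete with electronegativity differences. The sketch below uses standard Pauling electronegativity values. The 0.4 cutoff is an assumed rule of thumb (textbooks differ on the exact threshold), and the function names are hypothetical.

```python
# Pauling electronegativities for the elements discussed above.
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}

# Assumed rule-of-thumb cutoff; this is a convention, not a sharp physical boundary.
POLAR_CUTOFF = 0.4


def electronegativity_difference(atom_a: str, atom_b: str) -> float:
    """Absolute difference in Pauling electronegativity between two bonded atoms."""
    return abs(PAULING[atom_a] - PAULING[atom_b])


def is_polar_bond(atom_a: str, atom_b: str, cutoff: float = POLAR_CUTOFF) -> bool:
    """Classify a bond as polar when the difference reaches the chosen cutoff."""
    return electronegativity_difference(atom_a, atom_b) >= cutoff


for a, b in [("C", "C"), ("C", "H"), ("C", "O"), ("O", "H"), ("N", "H")]:
    label = "polar" if is_polar_bond(a, b) else "nonpolar"
    print(f"{a}-{b}: difference {electronegativity_difference(a, b):.2f}, {label}")
```

Under this cutoff the C-C and C-H bonds come out nonpolar, while bonds to oxygen or nitrogen come out polar, which matches the two definitions above.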

Functional groups: include groups like the hydroxyl group (-OH). Functional groups have definite chemical properties that they retain no matter where they occur. Both the hydroxyl and carbonyl groups, for example, are polar because of the electronegativity of the oxygen atoms. Other common functional groups are the acidic carboxyl (COOH), the phosphate (PO4-) and the basic amino (NH2) groups. Many of these functional groups can also participate in hydrogen bonding.

Types of Hydrocarbons

Hydrocarbons are organic compounds which consist entirely of H and C. Typically, substituted chemical moieties include one or more substituents that replace hydrogen. Examples include halo, alkyl, cycloalkyl, aralkyl, aryl, sulfhydryl, hydroxyl (-OH), alkoxyl, cyano (-CN), carboxyl (-COOH), and the like. See US 2010/0075375 A1 for definitions. 

Alkyl: refers to a saturated straight, branched, or cyclic hydrocarbon having from 1-22 C atoms. Alkyl groups include methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, n-pentyl, cyclopentyl, isopentyl, neopentyl, etc.

Alkane: are carbon chains held together by single bonds. 

–Methane (CH4): is the simplest organic compound. It is abundant in natural gas and is also produced by prokaryotes that live in swamps and in the digestive tracts of grazing animals.

–Octane (C8H18): is one of the main molecules in gasoline.
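
The alkanes mentioned in these notes (methane, propane, octane) all follow the general formula CnH2n+2 for non-cyclic alkanes. A minimal Python sketch follows; the helper name `alkane_formula` is an assumption for illustration, and the formula does not apply to cyclic alkanes.

```python
def alkane_formula(n_carbons: int) -> str:
    """Molecular formula of a non-cyclic alkane with n carbons (CnH2n+2)."""
    if n_carbons < 1:
        raise ValueError("an alkane needs at least one carbon atom")
    # By convention a subscript of 1 is not written, so methane is CH4, not C1H4.
    carbon_part = "C" if n_carbons == 1 else f"C{n_carbons}"
    return f"{carbon_part}H{2 * n_carbons + 2}"


for name, n in [("methane", 1), ("propane", 3), ("octane", 8)]:
    print(name, alkane_formula(n))
```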

Aryl: refers to an optionally substituted, mono or bicyclic aromatic ring system having from about 5-14 carbon atoms. Examples include phenyl.

Aralkyl: refers to alkyl radicals that bear an aryl substituent and have from about 6-22 C atoms. Aralkyl groups can be optionally substituted. Examples include benzyl, naphthylmethyl, diphenylmethyl, triphenylmethyl…

Alkoxy/Alkoxyl: refer to an optionally substituted alkyl-O- group. Examples include methoxy, ethoxy, n-propoxy, i-propoxy, n-butoxy, etc.

Functional Groups

The unique properties of an organic compound depend not only on its carbon skeleton but also on the atoms attached to the skeleton. These groups which are directly involved in chemical reactions are called “functional groups”. 

Hydroxyl group (-OH): is found in alcohols such as isopropyl alcohol (rubbing alcohol).

Carboxyl group (-COOH): is found in all proteins.

Aroyl: refers to a -C(=O)-aryl group.

Carboxy: refers to a -C(=O)(OH) group

Polysaccharides: 

Polysaccharides are longer polymers made up of monosaccharides that have been joined through dehydration reactions.

Starch: is a storage polysaccharide that consists entirely of alpha-glucose molecules linked in long chains. Branching, where it occurs, is due to bonds between the C-1 of one molecule and the C-6 of another (alpha (1-6) linkage).

Amylose: is the starch with the simplest structure. It is composed of many hundreds of alpha-glucose molecules linked together in long, unbranched chains. Each linkage occurs between the C-1 of one glucose molecule and the C-4 of another, making them alpha (1-4) linkages. Potato starch is about 20% amylose. Most plant starch, including the remaining 80% of potato starch, is a somewhat more complicated variant of amylose called amylopectin. Amylopectin is a branched polysaccharide, with the branches occurring due to bonds between the C-1 of one molecule and the C-6 of another. These short amylose branches consist of 20-30 glucose subunits.

Glycogen: is an insoluble polysaccharide containing branched amylose chains. Glycogen has a much longer average chain length and more branches than plant starch.

Cellulose: is a polymer of beta-glucose. The properties of a chain of glucose molecules consisting of all beta-glucose (as opposed to alpha-glucose) are very different from those of starch. These long, unbranched beta-linked chains make tough fibers. Cellulose is the main component of plant cell walls. It is chemically similar to amylose, but the enzymes that occur in most organisms cannot break the bond formed between two beta-glucose units because they only recognize alpha linkages. Because cellulose cannot be broken down readily by most animals, it works well as a biological structural material. Some animals, such as cows, are nevertheless able to utilize cellulose, aided by symbiotic bacteria and protists in their digestive tracts. These organisms provide the necessary enzymes to cleave the beta (1-4) linkages, releasing glucose for further metabolism.

Chitin: is a polymer of N-acetylglucosamine, a nitrogen-containing derivative of glucose. When cross-linked by proteins, it forms a tough, resistant surface material that serves as the hard exoskeleton of arthropods such as insects and crustaceans (e.g., lobsters). It is also a structural material in many fungi.
