Design of mini Cas9 proteins using computational tools




Artiles, Maria
Jayasinghe Arachchige, Vindi
Liu, Jin


0000-0001-7125-5113 (Artiles, Maria)




Purpose: Adeno-associated viral (AAV) vectors are routinely used to deliver CRISPR-Cas systems, but they can only package cargo up to ~4.7 kilobase pairs (kbp) in size. The most widely used Cas protein, Streptococcus pyogenes Cas9 (SpCas9), comprises 1368 amino acid (aa) residues and is 4.3 kbp in size. Delivering SpCas9 together with its guide RNA (gRNA) therefore requires two separate vectors, which decreases the overall effectiveness of the system. In this study, we used computational tools to facilitate the design of mini Neisseria meningitidis Cas9 (Nme1Cas9) nucleases, 900 aa in length or less, that can be packaged with their associated gRNA in a single vector. Nme1Cas9 is an attractive system: in its wild-type form it is already 286 aa residues smaller than the widely used SpCas9, and it has shown promising effectiveness in mammalian cells. Nme1Cas9 also has a longer spacer-derived guide sequence than other orthologs and a longer protospacer adjacent motif (PAM) consensus, which reduces the propensity for off-target effects. For these reasons, Nme1Cas9 provides an ideal starting point for the development of engineered mini Cas9 proteins, allowing us to exploit its natural features and optimize them with computational tools, including artificial intelligence (AI) and machine learning.
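The size argument above can be made concrete with a rough packaging-budget calculation. The following is a minimal sketch, not part of the study's workflow: it assumes three nucleotides per residue plus one stop codon, and the function names are hypothetical. Promoters, the gRNA cassette, and other regulatory elements, which must fit in the remaining space, are not modeled.

```python
# Rough AAV packaging budget for a Cas9 coding sequence.
# Assumption: ORF length = 3 nt per residue plus one stop codon.
# Function names here are illustrative, not from the study.

AAV_CAPACITY_BP = 4700  # ~4.7 kbp packaging limit quoted in the abstract


def coding_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of an open reading frame encoding n_residues amino acids."""
    return 3 * (n_residues + (1 if include_stop else 0))


def remaining_budget_bp(n_residues: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Space left in the vector for promoter, gRNA cassette, and other elements."""
    return capacity_bp - coding_length_bp(n_residues)


if __name__ == "__main__":
    candidates = [
        ("wild-type Nme1Cas9", 1368 - 286),   # 286 aa smaller than SpCas9
        ("mini Nme1Cas9 (upper bound)", 900),
    ]
    for name, n in candidates:
        print(f"{name}: ORF {coding_length_bp(n)} bp, "
              f"{remaining_budget_bp(n)} bp left for other elements")
```

Under these assumptions, a 900 aa design leaves roughly 2 kbp of the vector for the gRNA and regulatory elements.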

Methods: We used the Protein Data Bank (PDB), UniProt, and ChEMBL databases to obtain the sequences and crystal structures of Cas orthologs and their associated gRNA and DNA sequences. Next, we identified the known DNA- and RNA-interacting residues of Nme1Cas9 from the available literature. Sequence alignments were performed with Clustal Omega. Structural visualizations and reductions were performed with ChimeraX and YASARA. AlphaFold2 was used for 3D structure prediction, and molecular dynamics (MD) simulations were used to assess the stability of the designed proteins.
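One step implied by this workflow, screening candidate truncations so that they meet the length target while keeping the known nucleic-acid-interacting residues, can be sketched as follows. This is an illustrative assumption about how such a filter might look, not the authors' pipeline; the function names, the toy sequence, and the residue positions are all hypothetical.

```python
# Sketch of a truncation filter: delete residue ranges from a parent sequence,
# then check the length target and retention of required residues.
# All names, the toy sequence, and the positions are hypothetical.


def apply_deletions(sequence, deletions):
    """Remove 1-indexed inclusive ranges; return (mini_sequence, kept_positions)."""
    removed = set()
    for start, end in deletions:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"invalid deletion range: {start}-{end}")
        removed.update(range(start, end + 1))
    kept = [i for i in range(1, len(sequence) + 1) if i not in removed]
    mini = "".join(sequence[i - 1] for i in kept)
    return mini, kept


def passes_filters(sequence, deletions, required_positions, max_length=900):
    """True if the truncated design is short enough and keeps every required residue."""
    mini, kept = apply_deletions(sequence, deletions)
    kept_set = set(kept)
    return len(mini) <= max_length and all(p in kept_set for p in required_positions)


if __name__ == "__main__":
    parent = "ACDEFGHIKL" * 100           # toy 1000-residue stand-in, not Nme1Cas9
    design = [(101, 250)]                 # hypothetical deleted segment
    print(passes_filters(parent, design, required_positions={10, 500}))
    print(passes_filters(parent, design, required_positions={150}))
```

Designs that pass a filter of this kind would still need structure prediction and MD simulation, as described above, before any claim about folding or stability could be made.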

Results/Conclusions: We generated a library of mini Nme1Cas9 sequences that are less than 900 aa in length. AI-based modeling with AlphaFold2 showed that these mini Cas proteins fold similarly to their original counterparts. MD simulations confirm their stability in the presence of DNA and gRNA. Further validation and experimental testing of the designed proteins are ongoing.

Keywords: CRISPR-Cas9, Cas9, mini Cas9, Cas9 orthologs, AI.