De novo protein design means designing a primary structure (an amino acid sequence) that folds into a desired tertiary structure. Understanding how primary structure gives rise to secondary and tertiary structure begins with deconstructing existing proteins. The first step is to determine the primary structure through protein sequencing.
Short peptides of no more than about 50 amino acids can be sequenced using a technique called Edman degradation. Longer polypeptides can be sequenced by mass spectrometry; John Bennett Fenn shared the 2002 Nobel Prize in Chemistry for developing electrospray ionization, which made mass spectrometry of large biomolecules practical. An amino acid sequence can also be predicted, rather than determined empirically, by sequencing the DNA or RNA that encodes it and translating that sequence with the genetic code.
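As a rough illustration of predicting an amino acid sequence from nucleic acid, the sketch below translates a coding sequence with the standard genetic code. It assumes the input is already in the correct reading frame and starts at the first codon; the function name `translate` and the example sequences are invented for illustration, and real genes need extra handling (introns, alternative genetic codes, ambiguous bases).

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA or RNA coding sequence to one-letter amino acids.

    Translation stops at the first stop codon; a trailing partial codon is ignored.
    """
    dna = coding_sequence.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAATGGTAA"))  # prints MAKW
```

The same function accepts RNA because every U is converted to T before the codon lookup.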
A real protein molecule may contain thousands of amino acids, all of which interact to produce three or four levels of structure. Real proteins are so complex that modeling their folding "as-is," atom by atom, is generally impractical with current computing technology (though DNA computers may one day make this more feasible). Instead, they are simplified in two ways. First, instead of representing every atom, the simulation represents each amino acid as a whole, like a bead with specific properties. Second, the beads are restricted to the sites of a rigid cubic lattice, which gives these representations their name: lattice proteins.
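The bead-on-a-lattice idea can be sketched in a few lines. The example below is a minimal sketch, not a description of any particular simulation package: it assumes a two-letter hydrophobic-polar (HP) alphabet as the "specific properties" of each bead, and it assumes a scoring rule of -1 for every pair of hydrophobic beads that sit on neighboring lattice sites without being neighbors along the chain. The names `is_valid_walk` and `contact_energy` are hypothetical.

```python
def lattice_distance(p, q):
    """Number of unit lattice steps between two sites (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_valid_walk(coords):
    """A conformation is valid if no site is used twice and
    consecutive beads occupy adjacent lattice sites."""
    if len(set(coords)) != len(coords):
        return False
    return all(lattice_distance(p, q) == 1 for p, q in zip(coords, coords[1:]))


def contact_energy(sequence, coords):
    """Assumed HP scoring: -1 per contact between two 'H' beads that are
    lattice neighbors but not consecutive in the chain."""
    if len(sequence) != len(coords) or not is_valid_walk(coords):
        raise ValueError("sequence and coordinates do not describe a valid chain")
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip chain neighbors
            in_contact = lattice_distance(coords[i], coords[j]) == 1
            if in_contact and sequence[i] == "H" and sequence[j] == "H":
                energy -= 1
    return energy


sequence = "HPPH"
folded = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]    # chain bent into a square
extended = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]  # straight chain

print(contact_energy(sequence, folded))    # prints -1
print(contact_energy(sequence, extended))  # prints 0
```

Under this assumed scoring, the bent conformation is favored because it brings the two hydrophobic end beads into contact, which is the kind of comparison a lattice simulation repeats over many candidate conformations.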
Simplified models of this kind helped de novo protein design become a reality. "De novo" is a Latin phrase literally meaning "from the new," and de novo protein design is the engineering of proteins from scratch. The first de novo designed proteins were reported near the end of the 20th century (for example, Dahiyat & Mayo 1997).