Return to Teaching

2010 – Computational processes in living cells

Spring term 2010



Viewed as information processing systems, biological organisms possess amazing capabilities to perform information-handling tasks such as: control, pattern recognition, adaptability, information-storage, etc. Thus, the functioning of biological organisms as information-processing systems is of great interest to computer scientists, and we are witnessing now a fast growing research in this field. This research is genuinely interdisciplinary in nature, involving both computer scientists and molecular scientists (biologists, biochemists, biophysicists, crystallographers, …). One of the leading paradigms of this research is “cell as a computer”. A beautiful example of this paradigm is a single cell organism called ciliate – the gene assembly process in ciliates has turned out to be a very elegant computational process which even uses one of the basic data structures of computer science: the linked lists! This process of gene assembly (assembling genes of macronucleus from their micronuclear form) is the most involved DNA processing known in living organisms.

This course presents the computational aspects of gene assembly in ciliates. We review briefly the basic biology of ciliates and especially the gene assembly process. We then investigate several computational models to reason about gene assembly and to study its computational properties, including: assembly strategies, parallelism and concurrency in gene assembly, computing with ciliates, etc. 

In particular we will discuss:

  1. the benefits of this research for computer scientists – novel and challenging problems and paradigms which lead to broader and deeper understanding of the notion of computation
  2. the benefits of this research for biologists – novel and uniform explanation of experimental results, new insights into the gene assembly process, new suggestions for experiments.


  • Introduction to the biology of ciliates
  • The gene assembly process in ciliates
  • Molecular models for gene assembly
  • Modeling genes as permutations, strings, and graphs
  • Computational assembly strategies
  • Nondeterminism and invariants of the gene assembly process
  • Complexity measures for ciliates genes
  • Parallelism in gene assembly
  • SSimple operations
  • Computing through gene assembly

Literature: The course book is Ehrenfeucht, Harju, Petre, Prescott, Rozenberg – “Computation in living cell: Gene assembly in ciliates”.

  • Tero Harju, Ion Petre, Grzegorz Rozenberg – “Gene assembly in ciliates: Molecular operations”, Bulletin of EATCS, 881, 236 – 249, 2003.
  • Tero Harju, Ion Petre, Grzegorz Rozenberg – “Gene assembly in ciliates: Formal frameworks”, Bulletin of EATCS, 82, 227 – 241, 2004.
  • Andrzej Ehrenfeucht, Tero Harju, Ion Petre, David M. Prescott, Grzegorz Rozenberg – “Formal systems for gene assembly in ciliates”, Theoretical Computer Science, 292, 199-219, 2003.
  • Andrzej Ehrenfeucht, Ion Petre, David M. Prescott, Grzegorz Rozenberg – “String and graph reduction systems for gene assembly in ciliates”, Mathematical Structures in Computer Science, vol.12, 113-134, 2002.
  • Andrzej Ehrenfeucht, Tero Harju, Ion Petre, Grzegorz Rozenberg – “Characterizing the micronuclear gene patterns in ciliates”, Theory of Computing Systems 35, 501-519, 2002.
  • Tero Harju, Ion Petre, Grzegorz Rozenberg – “Two models for gene assembly”, in J.Karhumaki,G.Paun, G.Rozenberg, (eds.), Theory is Forever, LNCS 3113, 89-101, Springer, 2004.
  • Tero Harju, Chang Li, Ion Petre, Grzegorz Rozenberg – “Parallelism in gene assemby”, Preproceedings of the 10th International Meeting on DNA-based computers DNA 10, Milan, Italy, 74 – 83, 2004, to appear in Springer Lecture Notes in Computer Science, 2005.

Credits: 5 study points.

Components: 28h lectures, final examination. 

Time schedule: The course runs in the 4th term of the 2009-2010 academic year: March 16-May 6. 

  • The lectures are given every Tuesday and Thursday 13-15, ICT, room B3040 (Cobol).
  • First course exam: week 19 (May 10-16)
  • Second course exam: week 21 (May 24-30)

Prerequisites: No background on Biology is required, as this will be provided throughout the course, when needed. Some notions of discrete mathematics, especially permutations, strings, and graphs may be useful..

Lecturers:   Ion Petre (ion.petre at, Eugen Czeizler (eugen.czeizler at, Vladimir Rogojin (vladimir.rogojin at, Department of IT, Åbo Akademi University


  1. Introduction
  2. Biology of Ciliates
  3. Gene structure. The intramolecular model for gene assembly
  4. The intermolecular model. Template-based recombination
  5. Model forming
  6. Genes as MDS descriptors
  7. Genes assembly as a rewriting of MDS descriptors
  8. Genes as signed strings, gene assembly as string rewriting
  9. Genes as signed graphs, gene assembly as graph rewriting
  10. Equivalence of the three model formulations
  11. Nondeterminism and invariants of gene assembly
  12. Complexity measures for genes and gene assembly. Simple and elementary gene assembly
  13. Parallelism in gene assembly
  14. Computing through gene assembly

Last updated: April 28, 2010