Module nimna

Module nimna

This module provides all ViennaRNA-wrapper functionality available in nimna.

General usage

This section describes the general usage of the nimna package. A large part of nimna is a high-level Nim wrapper of the ViennaRNA 2.x (Lorenz et al. 2011) object-oriented API. As such, it provides all the functionality available in ViennaRNA behind an idiomatic Nim interface, while taking care of memory management using the Nim GC. It also provides nucleic acid design functionalities (currently based on an artificial immune system algorithm with naive sequence sampling) using the design submodule.

Installation

Installation of nimna and its dependencies is automatically handled using nimble install. This tries to autodetect, whether a suitable dynamic library version of libRNA exists on your computer. If that is not the case, it pulls all files necessary for installation from the web and installs them. In case of Windows, the ViennaRNA installer is started and you may need to complete the some installation steps manually, while for *nix ViennaRNA is built from source. As such, it suffices for most use-cases to just perform:

$ nimble install nimna

As nimble currently does not allow uninstall hooks, nimna provides a command to clean up dependencies, should you want to uninstall nimna:

$ nimnacleanup

This removes the dependency folder in the nimna package. If this is not done, nimble cannot fully remove nimna's package directory, leaving the deps folder to lie around.

Nucleic acid design

Beyond ViennaRNA functionality, nimna currently provides a module for general nucleic acid design, which currently implements an artificial immune system algorithm à la CLONALG (Castro and Von Zuben 2002) coupled with user defined fitness functions ("affinity" in the lingo of the CLONALG paper), and naive random resampling of nucleic acid sequences. Using this module, the user can design sequences conforming to arbitrary sequence constraints in the IUPAC notation, as well as arbitrary design targets, solely specified by a used-defined fitness function.

Submodules and their functionality

The functionality of nimna is split amongst multiple modules which are not meant to be imported separately, but are exported from nimna.nim. Those modules logically separate the functionality contained in nimna.

RNA

The RNA module provides a bare-bones wrapper of libRNA functionality generated using c2nim. All other modules (excluding design) provide higher-level conveniences around this. Should you for some reason need to access libRNA functionality directly, you can import this module, though for most use-cases nimna should suffice.

nimna_types

The module nimna_types provides all high-level wrapper types used in nimna, together with all the needed constants. It is included by every other submodule of nimna.

nimna_cutils

The module nimna_cutils provides a small set of utility templates for interfacing with libRNA.

nimna_compound

The module nimna_compound provides access to one of the core types in nimna - the Compound. A Compound object contains a sequence and all context required to fold that sequence, evaluate its energy, and all other things one can do with sequences in nimna. The module allows for the easy construction of Compounds from plain strings using the compound family of procs, as well as the dimer proc for multiple sequence folding. It also provides access to all fields of the underlying vrna_fold_compound_t via the . and .= macros, should access to compound internals be required.

nimna_model

The module nimna_model contains procedures to set parameters for folding, such as the temperature at which a compound should be folded. It also contains the update macro, which allows to easily update any number of parameters on a compound.

nimna_constraints

The nimna_constraints module provides facilities to constrain nucleic acid folding. This can be done by either forcing certain bases to pair or remain unpaired (with forcePaired, forceUnpaired, constrain, i.e hard constraints), or setting a free energy bonus for certain secondary structures (with preferPaired, preferUnpaired, i.e. soft constraints) or motifs (with preferMotif).

nimna_alignment

The module nimna_alignment provides convenience for using the Alignment data structure used to represent multiple sequence alignments in alignment folding.

nimna_fold

The module nimna_fold contains all the folding functionality on Compounds available in ViennaRNA. This includes minimum free energy folding (mfe), partition function folding (pf) and centroid folding (centroid).

nimna_eval

The module nimna_eval includes procedures for evaluating the free energy of nucleic acid sequences folded into particular secondary structures (eval), as well as the energy differences induced by adding or removing a single base-pair. This is useful for defining fitness functions for nucleic acid design.

nimna_probabilities

The module nimna_probabilities provides the facilities for dealing with Probabilities objects, which allow to inspect the base-pairing probabilities computed in partition function folding (obtained using prob).

nimna_sample

Samples secondary structures from a Compound, according to the Boltzmann-distribution of the Compound's ensemble.

nimna_subopt

The module nimna_subopt allows to generate suboptimal foldings within a given energy range around the minimum free energy fold (using suboptimals).

nimna_interactionlist

The module nimna_interactionlist contains procs and iterators for dealing with PairList objects, which describe the base-pairing within a nucleic acid sequence (e.g. pairList). It further provides maximum expected accuracy folding on PairLists (using mea).

nimna_misc

The module nimna_misc provides miscellaneous utility functions.

Future directions

Though nimna already provides basic functionalities for working with nucleic acids, there is still much work to do with regards to making it better, more in-sync with current research and more compatible with the Nim ecosystem.

Here are some things off the top of my head, which should be done to improve nimna

Factor out the artificial immune system engine:
While an artificial immune system algorithm seemed reasonable at the time I wrote the design module and still had its admittably terrible Python predecessor JAWS of iGEM 2015 infamy in mind, it would certainly be better to have a separate package define an Optimizer concept which could then be implemented for a number of different optimization algorithms a user could coose from.
Factor out the nucleic acid random sampling:
Same as above. Just sampling nucleotides from a certain input distribution is not optimal, as schemes such as RNAblueprint (Hammer et al. 2017) exist. Why not provide a concept to define what a random sampler should do, implement a few options and let the user make that choice?
Finally make all of nimna work with JavaScript:
This has been long in the works and fraught with quite a bit of frustration. The ViennaRNA codebase is not exactly Emscripten friendly, and interfacing to Emscripten from the JavaScript backend is painful, but possible. Furthermore, some sane way to manage Emscripten-allocated resources from JavaScript needs to be written. All in all, I still think this is a needed feature, but it will probably take a while.
Automate c2nim wrapper construction:
Currently, the low-level ViennaRNA wrapper is hand-modified c2nim output, mostly to deal with type dependency issues. It would be nice, if we could build that wrapper automatically from a fresh download of ViennaRNA sources, adapted to a chosen set of build options, as ViennaRNA has a few #ifdef inside types, leading to different type layouts for different inputs to configure.
Improve the installation experience:
Currently, nimble install nimna just downloads and builds the ViennaRNA sources with a hardcoded set of sane configuration options. For some usecases different options would be preferable, which is currently not supported by nimnas .nimble file. It would be nice if the user were instead guided through the configuration process upon installation.
Add preset fitness functions:
Currently, no preset fitness functions exist for the design module, requiring the user to know exactly what they want to optimize. A few sane preset fitness functions would probably lower the barrier to effectively use that module.

Non-goals

Though nimna aims to be a library for nucleic acid folding and design, there are a few things it does not wish to be, which should be handled in separate packages.

nucleic acids as genetic material:
nimna is not, and will not be a library dealing with sequencing data, variant calling, or the like. It is meant for dealing with the physical properties of nucleic acids, not their properties as a genetic material.
primer / guide RNA / insert my oligo here design:
while I do encourage nimna to be used to power a library providing primer / guide RNA / other special oligo design capabilities, that functionality should not be baked into nimna, as it encroaches on the territory of nucleic acids as genetic material, which should be in a separate package.
directly involve molecular modeling and dynamics:
while nimna could at some point provide functionalities dealing with nucleic acid tertiary structure and molecular dynamics, the underlying molecular modeling and dynamics should be a separate package. nimna should only deal with nucleic acid structure, not be a molecular dynamics engine.

iGEM disclaimer

nimna is not a project of a Heidelberg iGEM team, future or past. Certainly, it was inspired by the work of the Heidelberg iGEM team 2015 but it is its own project, which I work on, on my own time and which will be maintained and supported, in contrast to some iGEM code out there.