Macromolecular Crowding in Cell Biology Simulations

Satya Nanda Vel Arjunan

Graduate School of Media and Governance and Institute for Advanced Biosciences, Keio University, Tsuruoka, 997-0035, Japan.

Abstract

Recent experimental and computational works suggest molecular crowding is an important issue because it affects both reaction and diffusion of molecules in the cell medium. Lattice-based simulation approaches can model the effects of molecular crowding since it represents each molecule explicitly. However, this advantage also presents a significant drawback in computational cost. Parallelization of lattice-based simulation algorithm can alleviate the computational cost but it cannot scale for high number of processors. Here, we present a new synchronization approach for lattice-based simulation algorithm that permits better scalability than other existing methods when using high number of processors.

Key words: Macromolecular crowding, Monte-Carlo simulation, Spatial simulation, Enzymatic reactions, Diffusion

1 Introduction

Unlike the solutions in the test tube, in vivo environments contain high total macromolecular content resulting in molecular crowding [1]. Since large amount of volume is occupied by a small number of molecules, the media is crowded instead of having high concentrations of a certain molecular species. Hence, even a system such as gene expression which apparently involves a small number of molecules may actually be crowded as a result of the space occupied by macromolecules that are external to the system. Since the total macromolecular concentration is 50-400 mg/ml, only 60-95% of the total intracellular volume is available for other molecules [2].

Macromolecular crowding enhances protein association and increases rate of protein folding and refolding. Diffusion of macromolecules in the cytoplasm is about 5-20 times slower than in saline solutions [3]. In some cases, the diffusing molecule never returns to its origin. This occurrence, known as anomalous diffusion was shown through simulation [4]. Additionally, the chemical potential of proteins increases when there is a macromolecule added to the solution. The increased potential causes mutual exclusion of proteins from the media. Crowding also causes mutual impenetrability of the reactants, called excluded volume effect, and steric repulsion in the reaction media [5]. The value of Michaelis-Menten constant decreases in the presence of a macromolecule in UniUni, BiUni and competitive inhibitory reactions. This means that the affnity of the substrate(s) towards the enzyme increases when other macromolecule is present. Furthermore, in gene expression systems, the excluded volume effect was found to increase the binding and activity of RNA polymerase to DNA. Several groups [6,4] showed through simulation that increased concentration makes reaction kinetics fractal and thus, breaking the law of mass action and power law approximation.

Considering the above effects, it is crucial that we incorporate the effects of molecular crowding during modeling and simulation to produce more accurate representation, prediction and results. From our recent review of existing spatial simulation approaches [7], we concluded that lattice-based algorithms have good potential to represent the effects of molecular crowding. Some of the properties of lattice-based algorithm include the ability to generate excluded volume effect and stochasticity of reaction and diffusion of molecules. Nonetheless, lattice-based simulation algorithms present a huge drawback in computational cost because each molecule needs to be simulated explicitly. Although parallelization of lattice based simulation algorithm is relatively straight forward, synchronization of data at each time step is still not computationally effcient when scaling to high number of processors because some of the processors need to be blocked from execution using mutexes. The computational cost due to such blocking is signiﬁcant because it is accumulated during simulation from the large number of time steps. In this work, we present a lattice-based simulation algorithm that uses an atomic primitive called CompareAndSwap instead of mutexes. From our simulation results, this approach reduces the computational time when scaling to high number of processors.

In the following section, we describe the proposed lattice-based simulation algorithm with reaction and diffusion of molecules. The data structure and the new synchronization algorithm using CompareAndSwap is presented after that. Some simulation results comparing the performance of our algorithm with the conventional approach using mutexes are also provided. Finally, we discuss the drawbacks and potential applications of this new algorithm.

2 Reaction-Diffusion Algorithm

In this section a new lattice-based approach is proposed to consider the molecular crowding effects of the cell. The properties of this approach include the ability to simulate both reaction and diffusion of molecules stochastically, multiscale and representation of molecules with varied size and shape.

The proposed approach is multiscale since it conducts simulation on both macroscopic and microscopic scales simultaneously. At the macroscopic scale, we have very small molecules such as ions that can be assumed to be homogeneously located in the cell. The threshold for the assumption that the molecules are small and homogeneous is the spacing of the lattice. For example, if the lattice spacing is 1 nm, any molecule which has a diameter smaller than 1 nm is assumed to be homogeneous and placed in the macroscopic scale. Based on the work by [8], the size of the lattice can be set as 1 nm since it is the typical size of lipids. If, however, the smallest molecule in all the reactions is larger than 1 nm, the lattice spacing can be increased to the size of the smallest molecule. In this case, there are no molecules simulated at the macroscopic scale.

Molecules that are equivalent or larger than 1 nm are simulated at the microscopic scale. These molecules are placed on the periodic boundary condition lattice (i.e. the shape corresponds to a torus) and could occupy lattice(s) proportional to their sizes and basic shapes. The smallest molecule on the lattice is called the tracer. For molecules for which their sizes are not available, their molecular weights, normalized to the that of the tracer, are used to estimate their sizes. An example of a lattice occupied by microscopic-scale molecules is shown in Fig. 1.

Initially, the overall lattice size is determined using the volume of the actual cell. The microscopic molecules are then placed randomly in the 2D lattice.

Fig. 1. A 2D Lattice

P, E and T are diffusing molecules with varying shapes and sizes. O is a non-diffusing obstacle molecule which occupies space in the lattice. T is the tracer molecule since it only occupies a single lattice site.

All molecules will be selected randomly at each time step. The size of the time-step is calibrated based on the spacing of the lattice following the method used by [8].

For each selected molecule, a random number is generated. The molecule is moved if the random number is smaller than the probability, p = Dmolecule/Dtracer, where Dmolecule and Dtracer are the diffusion constants of the selected molecule and tracer respectively. The diffusion constants can be determined from (1) with available molecular weight and density.

In the Brownian motion equation above, k is the Boltzmann constant, T is the absolute temperature, µ is the solvent’s viscosity, N A is the Avogadro’s number, ρ is the density of the molecule and M is the molecular weight of the molecule.

The molecule can move in one of either 4 directions. The direction of movement is selected using another random number. The molecule displaces in proportion to the spacing of the lattice at each time step.

If there is another molecule blocking its movement, one of the following procedures is followed:

If the blocking molecule is a bimolecular reactant pair of the moving molecule, generate another random number. If this random number is less than their reaction probability, execute the reaction (e.g. form a complex at the movement destination), otherwise, place the moving molecule at its original position.
If the blocking molecule is not a bimolecular reactant pair, the moving molecule will stay at its original position.

On the other hand, if there is no other molecule blocking, one of the following procedures is followed:

If the moving molecule is a bimolecular reactant and its reactant pair is a macroscopic molecule, generate another random number. If this random number is less than their reaction probability, execute the reaction (e.g. form a complex at the movement destination, reduce the number of the macroscopic reactant by 1), otherwise, move the moving molecule to its destination position.
If the moving molecule is a unimolecular reactant, generate another random number. If this random number is smaller than the reaction probability, execute the reaction (e.g. dissociate the molecule to two molecules, one in the destination location, another in the original position), otherwise, move the moving molecule to its destination position.

If the moving molecule is not a reactant, simply move it to its destination if there is no other molecule blocking.

3 Symmetric Memory Multiprocessing of the Lattice-Based Algorithm

A straight forward approach to parallelizing lattice-based simulation algorithms is to partition the lattice into equal areas (or volumes for 3D lattice) and assign the partitions to different processors. An example of a lattice with three partitions for three processors is shown in Fig. 2. The processors need to synchronize when the molecules diffuse from on partition to another. In our approach, we created shared regions within the partitions that will be used by two processors to permit synchronization. In Fig. 2. regions b and c are shared by processor 0 and 1, while regions e and f are shared by processor 1 and 2. The width of the shared regions is the radius of the largest diffusing molecule in the lattice.

In the previous section, we mentioned that the molecules are placed at random locations within the lattice during initialization. During the placement, the molecules are classiﬁed into the regions in which they are placed. At each time step, a processor executes reaction and diffusion of molecules within

Fig. 2. Partitioning of Lattice into Regions for Symmetric Processing

P0, P1 and P2 are the three partitions of the lattice assigned to processor 0, 1 and 2 respectively. a, b .. g are the regions within the partitions which are executed sequentially by the processor. For example, processor 0 must complete executing diffusion and reaction of molecules within the region a before executing within region b. x is the radius of the largest diffusing molecule in the lattice. y is obtained by dividing the length of the lattice, z by the number of processors.

each region of its partition sequentially. For example, processor 0 executes the molecules within region a before proceeding to region b. However, the processor needs to ensure that when it is executing the molecules within a shared region, another processor is not excutinng the molecules within its adjacent region. In this case, processor 0 must only excute within region b when processor 1 is not excuting region c. This can be implemented by creating busy ﬂags for each shared region and checking the ﬂags of the adjacent shared region before excuting a shared region.

Before proceeding to the next time step, all processors must have completed executing the molecules within their partitions. To implement this, a conventional approach is to use a barrier at each time step to ensure that all processors (i.e., executing threads) have reached that position in the code before proceeding to the next time step. The barrier consists of mutexes which presents signiﬁcant computational cost when scaling to high number of processors and high number of time steps. In our approach, we avoided using mutexes by adopting a shared counter and incrementing it via an atomic primitive function called CompareAndSwap. An atomic primitive function is a special kind of function that is once called, cannot be interrupted until it has ﬁnished executing. The CompareAndSwap function takes as arguments the address of a shared memory location, an expected value, and a new value. If the shared location currently holds the expected value, it is assigned the new value atomically. A Boolean return value indicates whether the replacement occured. In our approach, once a processor ﬁnished excuting all the molecules within its partition, it increments this shared counter using CompareAndSwap and enters a loop to wait for a signal to execute the partition again for the next time step. The signal for the next time step is given by a master processor only when the shared counter equals the number of processors (i.e., all the processors have ﬁnished executing their respective partitions). Before issuing the signal, the master processor resets the shared counter to 0. \

4 Results

To evaluate the performance of our lattice-based approach for symmetric multiprocessing, we used the MinC, MinD and MinE oscillating protein model in Escherichia coli [9]. We used a lattice resolution of 30 nm with E. coli volume of 0.88 ﬂ. Initial number of molecules for MinC is 4400, MinD is 4400, MinE is 2200, MinCD is 0. We also used 5000 non-diffusing and non-reacting obstacle molecules. The algorithm was implemented as a plugin module to the E-Cell Simulation Environment version 3 and run on the IBM pSeries 650 with 8 processors and 40 GB of memory. We compared the performance of our algorithm with mutexes and with the CompareAndSwap function. The ratio of simulation time with real time on average for 1 minute of simulation time is 2.35 for mutexes but it increased to 2.51 for CompareAndSwap.

5 Discussions

We have presented a new lattice-based 2D simulation algorithm for symmetric multiprocessing in this work. This algorithm integrates the effects of molecular crowding such as excluded volume and varying shapes and sizes of molecules. Molecules are diffused on the lattice based on Brownian motion and reacts according to their reaction probability. For synchronization of processors at each time step, we used an atomic function called CompareAndSwap and have achieved favorable results over conventional approach using mutexes. The algorithm can also be extended for 3D simulation in a straight forward manner. As a result of it scalability to high number of processor, we can further introduce other computationally expensive properties such as weak interactions between molecules.

Acknowledgments

We would like to thank Dr. Koichi Takahashi for useful discussions.

References

[1] D. Hall, A. Minton, Macromolecular crowding: qualitative and semiquantitative successes, quantitative challenges., Biochim. Biophys. Acta. 1649 (2) (2003) 127-39.

[2] A. Fulton, How crowded is the cytoplasm?, Cell 30 (2) (1982) 345–7.

[3] M. Elowitz, M. Surette, P. Wolf, J. Stock, S. Leibler, Protein mobility in the cytoplasm of Escherichia coli., J. Bacteriol. 181 (1) (1999) 197–203.

[4] H. Berry, Monte carlo simulations of enzyme reactions in two dimensions: fractal kinetics and spatial segregation., Biophys. J. 83 (4) (2002) 1891–901.

[5] A. Minton, The inﬂuence of macromolecular crowding and macromolecular conﬁnement on biochemical reactions in physiological media., J. Biol. Chem. 276 (14) (2001) 10577–80.

[6] S. Schnell, T. Turner, Reaction kinetics in intracellular environments with macromolecular crowding: simulations and rate laws., Prog. Biophys. Mol. Biol. 85 (2-3) (2004) 235–60.

[7] K. Takahashi, S. N. V. Arjunan, M. Tomita, Space in systems biology of signaling pathways – towards intracellular molecular crowding in silico, FEBS Lett. 579 (3) (2005) 1783–1788.

[8] I. Tremmel, H. Kirchhoff, E. Weis, G. Farquhar, Dependence of plastoquinol diffusion on the shape, size, and density of integral thylakoid proteins., Biochim. Biophys. Acta 1607 (2-3) (2003) 97–109.

[9] R. A. Kerr, H. Levine, T. J. Sejnowski, W.-J. Rappel, Division accuracy in a stochastic model of min oscillations in escherichia coli, Proc. Natl. Acad. Sci. USA 103 (2) (2005) 347–352.