Ligands are known for only two human olfactory receptors. One of them, OR1D2, binds bourgeonal [Malnic B, Godfrey P-A, Buck L-B (2004) The human olfactory receptor gene family. Proc. Natl. Acad. Sci. U.S.A. 101: 2584-2589; erratum in Proc. Natl. Acad. Sci. U.S.A. (2004) 101: 7205]. OR1D2, OR1D4 and OR1D5 are three full-length olfactory receptors present in one olfactory locus of the human genome. Their DNA sequences are more than 80% identical and differ at 108 base pair positions. Using L-system mathematics, we show that a close relative of the OR1D2, OR1D4 and OR1D5 subfamily can be generated.
Olfactory receptor (OR) loci in the human genome occur in clusters, ranging from ~51 to 105, which are unevenly spread over 21 chromosomes (1,2). A conservative estimate suggests that 339 full-length OR genes and 297 OR pseudogenes are present in these clusters (1). Theoretically, there are two possible modes of OR-odorant molecular binding, viz., (i) each OR binds a large number of different odorants and (ii) each OR binds a small number of odorants. In either case, odorant detection at the OR level follows a combinatorial rule, though the stringency of the rule would differ between the two alternatives. Experimentally, it has been demonstrated that each OR recognizes a large number of odorants, and perhaps a wide range of concentrations of the odorants tested (3). Based on conceptually translated protein sequences, an OR gene family (>40% amino acid identity) can be divided into subfamilies (>60% identity), and subfamily members may share more than 90% identity (4). Subfamily members are highly similar in DNA and protein sequence, yet they are capable of recognizing different odorant molecules.
Three full-length model subfamily members, OR1D2 (gene length: 936 bp), OR1D4 (gene length: 936 bp) and OR1D5 (gene length: 936 bp), were downloaded from the HORDE database (http://genome.weizmann.ac.il/horde/). The three sequences were aligned using ClustalW and were found to contain 108 mismatched positions out of the 936 base pairs available (data not shown). Because OR1D2, OR1D4 and OR1D5 are highly related, a canonical sequence for this subfamily, termed the 'star model' of the OR sequence, was built with a C program in which 108 gaps were introduced at the respective positions, following a computer algorithm given at the end of the paper in reference (Fig 1a
A context-free L-system (5) was used to generate a 243 bp DNA sequence.
Set of variables: A, T, C and G.
Production rules: A → CTG, C → CCA, T → TGC and G → GAC.
Following these production rules and starting from C, the 1st and 2nd iterations give CCA (3 bp) and CCACCACTG (9 bp), respectively. Each iteration triples the length, so four iterations yield an 81 bp sequence, which is too short to fill 108 mismatch positions. Five iterations generate a 243 bp sequence, referred to as sequence (i).
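The rewriting described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' program: the function name `l_system` is hypothetical, and the starting symbol C is inferred from the stated first iteration (CCA).

```python
# Production rules as stated in the text.
RULES = {"A": "CTG", "C": "CCA", "T": "TGC", "G": "GAC"}


def l_system(axiom, rules, iterations):
    """Apply the context-free rules to every symbol, `iterations` times."""
    seq = axiom
    for _ in range(iterations):
        # Each symbol is rewritten independently of its neighbours.
        seq = "".join(rules[symbol] for symbol in seq)
    return seq


if __name__ == "__main__":
    # Assumed axiom: "C" (inferred, since only C -> CCA gives CCA at step 1).
    for n in range(1, 6):
        print(n, len(l_system("C", RULES, n)))
```

Because every rule maps one nucleotide to three, the length after n iterations is 3^n, which is why five iterations are the first to exceed the 108 positions that need filling.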
Using a C program, the nucleotides of sequence (i) were introduced sequentially, starting from its 5' end, into the star model gaps shown in Fig. 1. Briefly, the procedure was as follows:
Step 1: First, a single nucleotide is inserted into every gap in the star model (gaps of 1 bp, 2 bp, 3 bp and 4 bp).
Step 2: The 1 bp gaps in the star model are thereby closed. The remaining gaps (now 1 bp, 2 bp and 3 bp) are then filled in the same way, and the process is repeated until all gaps are filled.
BLAST searches, run with the search parameters available on the HORDE website (which cannot be changed by a remote user), show that the resulting sequence (ii) has 92%, 92% and 91% identity with OR1D2, OR1D4 and OR1D5, respectively. Significantly, these insertions do not produce any stop codon in the exon sequence. It is therefore clear that a close relative of the OR1D2, OR1D4 and OR1D5 subfamily can be generated by this approach. In the next paper, we show that an L-system can also be used to generate a close relative of a single pseudogene present in the same locus as OR1D2, OR1D4 and OR1D5.
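The gap-filling procedure of Steps 1 and 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' C program: gaps are assumed to be marked with '-', the function name `fill_gaps` is hypothetical, and the example input is a toy string rather than the real star model.

```python
import re


def fill_gaps(star, donor, gap="-"):
    """Fill gap runs one nucleotide per run per pass, reading donor 5' to 3'."""
    seq = list(star)
    # Each run is [next_open_position, end_of_run) for a stretch of gaps.
    runs = [[m.start(), m.end()]
            for m in re.finditer(re.escape(gap) + "+", star)]
    needed = sum(end - start for start, end in runs)
    if needed > len(donor):
        raise ValueError("donor sequence is shorter than the total gap length")
    donor_iter = iter(donor)
    while runs:
        # One pass: every still-open run receives exactly one nucleotide.
        for run in runs:
            seq[run[0]] = next(donor_iter)
            run[0] += 1
        # Runs that are now full (e.g. 1 bp gaps after pass 1) drop out.
        runs = [run for run in runs if run[0] < run[1]]
    return "".join(seq)


if __name__ == "__main__":
    toy_star = "AT-GC--TA---G"   # toy example, not the OR1D star model
    toy_donor = "CCACCACTG"      # first 9 bp of the L-system output
    print(fill_gaps(toy_star, toy_donor))
```

With 108 gap positions in the star model, only the first 108 nucleotides of the 243 bp sequence would be consumed by such a procedure.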
In summary, we report that close relatives of OR1D2, OR1D4 and OR1D5 can be generated using L-system mathematics.