Provides two classes, Dseq and Drecord, for handling double stranded DNA sequences. Dseq and Drecord are subclasses of Biopythons Seq and SeqRecord classes, respectively. These classes support the notion of circular and linear DNA.
Bases: Bio.SeqRecord.SeqRecord
Drecord is a double stranded version of the Biopython SeqRecord class. The Drecord object holds a Dseq object describing the sequence. Additionally, Drecord hold meta information about the sequence in the from of a list of SeqFeatures, in the same way as the SeqRecord does. The Drecord can be initialized with a string, Seq, Dseq, SeqRecord or another Drecord. The sequence information will be stored in a Dseq object in all cases. Drecord objects can be read or parsed from sequences in Fasta, Embl or Genbank format.
There is a short representation associated with the Drecord. Drecord(-3) represents a linear sequence of length 2 while Drecord(o7) represents a circular sequence of length 7.
Drecord and Dseq share the same concept of length:
<-- length -->
GATCCTTT
AAAGCCTAG
Parameters : | record : string, Seq, SeqRecord, Dseq or other Drecord object
circular : bool, optional
linear : bool, optional
filter : bool, optional
raw : string, optional
parsed_from : string, optional
|
---|
Examples
>>> from pydna import Drecord
>>> a=Drecord("aaa")
>>> a
Drecord(-3)
>>> a.seq
Dseq(-3)
aaa
ttt
>>> from Bio.Seq import Seq
>>> b=Drecord(Seq("aaa"))
>>> b
Drecord(-3)
>>> b.seq
Dseq(-3)
aaa
ttt
>>> from Bio.SeqRecord import SeqRecord
>>> c=Drecord(SeqRecord(Seq("aaa")))
>>> c
Drecord(-3)
>>> c.seq
Dseq(-3)
aaa
ttt
>>> a.seq.alphabet
IUPACAmbiguousDNA()
>>> b.seq.alphabet
IUPACAmbiguousDNA()
>>> c.seq.alphabet
IUPACAmbiguousDNA()
>>>
Attributes
filtered | bool | Sequence was filtered or not |
parsed_from | string | The string from which the object was parsed (if any). Empty by default. |
raw | string | The string from which the object was parsed (if any). default is ‘not defined’ |
warnings | string | Any warnings issued during parsing. |
Methods
Digest the Drecord object with one or more restriction enzymes.
Parameters : | enzymes : iterable object
|
---|---|
Returns : | fragments : list
|
Examples
>>> import pydna
>>> a=pydna.Drecord("ggatcc")
>>> from Bio.Restriction import BamHI
>>> a.cut(BamHI)
[Drecord(-5), Drecord(-5)]
>>> frag1, frag2 = a.cut(BamHI)
>>> frag1.seq
Dseq(-5)
g
cctag
>>> frag2.seq
Dseq(-5)
gatcc
g
Returns the sequence as a string using a format supported by Biopython SeqIO. Default is “gb” which is short for Genbank.
Examples
>>> import pydna
>>> a=pydna.Drecord("aaa")
>>> a.annotations['date'] = '02-FEB-2013'
>>> a
Drecord(-3)
>>> print a.format()
LOCUS . 3 bp DNA linear UNK 02-FEB-2013
DEFINITION .
ACCESSION <unknown id>
VERSION <unknown id>
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
ORIGIN
1 aaa
//
Returns a circular version of the Drecord object. The underlying Dseq object has to have compatible ends.
Examples
>>> import pydna
>>> a=pydna.Drecord("aaa")
>>> a
Drecord(-3)
>>> b=a.looped()
>>> b
Drecord(o3)
>>>
Returns a new Drecord object which is the reverse complement.
Examples
>>> import pydna
>>> a=pydna.Drecord("ggaatt")
>>> a
Drecord(-6)
>>> a.seq
Dseq(-6)
ggaatt
ccttaa
>>> a.reverse_complement().seq
Dseq(-6)
aattcc
ttaagg
>>>
Returns the SEGUID for the sequence
Examples
>>> import pydna
>>> a=pydna.Drecord("aaa")
>>> a.seguid()
'YG7G6b2Kj/KtFOX63j8mRHHoIlE'
Returns a circular Drecord with a new origin <shift>. This only works on circular Drecords. If we consider the following circular sequence:
The T and the G on the watson strand are linked together as well as the A and the C of the of the crick strand.
if shift is 1, this indicates a new origin at position 1:
new sequence:
Shift is always positive and 0<shift<length, so in the example below, permissible values of shift are 1,2 and 3
>>> import pydna
>>> a=pydna.Drecord("aaat",circular=True)
>>> a
Drecord(o4)
>>> a.seq
Dseq(o4)
aaat
ttta
>>> b=a.shifted(1)
>>> b
Drecord(o4)
>>> b.seq
Dseq(o4)
aata
ttat
Adds a seguid stamp to the description property. This will show in the genbank format. The following string:
SEGUID <seguid>
will be appended to the description property of the Drecord object (string).
Examples
>>> import pydna
>>> a=pydna.Drecord("aaa")
>>> a.stamp()
>>> a.description
'<unknown description> SEGUID YG7G6b2Kj/KtFOX63j8mRHHoIlE'
>>> a.verify_stamp()
True
This function returns a new circular sequence (Drecord object), which has ben rotated in such a way that there is maximum overlap between the sequence and ref, which may be a string, Biopython Seq, SeqRecord object or another Drecord object.
The reason for using this could be to rotate a recombinant plasmid so that it starts at the same position after cloning. See the example below:
Examples
>>> import pydna
>>> a=pydna.Drecord("gaat",circular=True)
>>> a.seq
Dseq(o4)
gaat
ctta
>>> d = a[2:] + a[:2]
>>> d.seq
Dseq(-4)
atga
tact
>>> insert=pydna.Drecord("CCC")
>>> recombinant = (d+insert).looped()
>>> recombinant.seq
Dseq(o7)
atgaCCC
tactGGG
>>> recombinant.synced(a).seq
Dseq(o7)
gaCCCat
ctGGGta
Verifies the SEGUID stamp in the description property is valid. True if stamp match the sequid calculated from the sequence. Exception raised if no stamp can be found.
>>> import pydna
>>> a=pydna.Drecord("aaa")
>>> a.annotations['date'] = '02-FEB-2013'
>>> a.seguid()
'YG7G6b2Kj/KtFOX63j8mRHHoIlE'
>>> print a.format("gb")
LOCUS . 3 bp DNA linear UNK 02-FEB-2013
DEFINITION .
ACCESSION <unknown id>
VERSION <unknown id>
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
ORIGIN
1 aaa
//
>>> a.stamp()
>>> a
Drecord(-3)
>>> print a.format("gb")
LOCUS . 3 bp DNA linear UNK 02-FEB-2013
DEFINITION <unknown description> SEGUID YG7G6b2Kj/KtFOX63j8mRHHoIlE
ACCESSION <unknown id>
VERSION <unknown id>
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
ORIGIN
1 aaa
//
>>> a.verify_stamp()
True
>>>
Writes the Drecord to a file using the format f, which must be a format supported by Biopython SeqIO. Default is “gb” which is short for Genbank.
Filename is the path to the file where the sequece is to be written. The filename is optional, if it is not given, the description property (string) is used together with the format:
If obj is the Drecord object, the default file name will be:
<obj.description>.<f>
If the filename already exists and the sequence it contains is different, a new file name will be used:
<obj.description>_NEW.<f>
Bases: Bio.Seq.Seq
Dseq is a class designed to hold information for a double stranded DNA fragment. Dseq also holds information describing the topology of the DNA fragment (linear or circular).
Dseq is a subclass of the Biopython Seq object. It stores two strings representing the watson (sense) and crick(antisense) strands. two properties called linear and circular, and a numeric value ovhg (overhang) describing the stagger for the watson and crick strand in the 5’ end of the fragment.
The most common usage is probably to create a Dseq object as a part of a Drecord object (see Drecord).
There are three ways of creating a Dseq object directly:
Only one argument (string):
>>> import pydna
>>> pydna.Dseq("aaa")
Dseq(-3)
aaa
ttt
The given string will be interpreted as the watson strand of a blunt, linear double stranded sequence object. The crick strand is created automatically from the watson strand.
Two arguments (string, string):
>>> import pydna
>>> pydna.Dseq("gggaaat","ttt")
Dseq(-7)
gggaaat
ttt
If both watson and crick are given, but not ovhg an attempt will be made to find the best annealing between the strands. There are limitations to this! For long fragments it is quite slow. The length of the annealing sequences have to be at least half the length of the shortest of the strands.
Three arguments (string, string, ovhg=int):
The ovhg parameter controls the stagger at the five prime end:
ovhg=-2
XXXXX
XXXXX
ovhg=-1
XXXXX
XXXXX
ovhg=0
XXXXX
XXXXX
ovhg=1
XXXXX
XXXXX
ovhg=2
XXXXX
XXXXX
Example of creating Dseq objects with different amounts of stagger:
>>> pydna.Dseq(watson="agt",crick="actta",ovhg=-2)
Dseq(-7)
agt
attca
>>> pydna.Dseq(watson="agt",crick="actta",ovhg=-1)
Dseq(-6)
agt
attca
>>> pydna.Dseq(watson="agt",crick="actta",ovhg=0)
Dseq(-5)
agt
attca
>>> pydna.Dseq(watson="agt",crick="actta",ovhg=1)
Dseq(-5)
agt
attca
>>> pydna.Dseq(watson="agt",crick="actta",ovhg=2)
Dseq(-5)
agt
attca
the ovhg parameter has to be given with both watson and crick, otherwise an exception is raised.
>>> pydna.Dseq(watson="agt",ovhg=2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pydna_/dsdna.py", line 169, in __init__
else:
Exception: ovhg defined without crick strand!
The default alphabet is set to Biopython IUPACAmbiguousDNA
The shape of the fragment is set by either:
linear = False, True
or
circular = True, False
Note that both ends of the DNA fragment has to be blunt to set circular = True (or linear = False).
>>> pydna.Dseq("aaa","ttt")
Dseq(-3)
aaa
ttt
>>> pydna.Dseq("aaa","ttt",ovhg=0)
Dseq(-3)
aaa
ttt
>>> pydna.Dseq("aaa", "ttt", linear = False ,ovhg=0)
Dseq(o3)
aaa
ttt
>>> pydna.Dseq("aaa", "ttt", circular = True , ovhg=0)
Dseq(o3)
aaa
ttt
Coercing to string
>>> a=pydna.Dseq("tttcccc","aaacccc")
>>> a
Dseq(-11)
tttcccc
ccccaaa
>>> str(a)
'ggggtttcccc'
The double stranded part is accessible through the dsdata property:
>>> a.dsdata
'ttt'
Drecord and Dseq share the same concept of length:
<-- length -->
GATCCTTT
AAAGCCTAG
Methods
Fill in of five prime protruding ends and chewing back of three prime protruding ends by a DNA polymerase providing both 5’-3’ DNA polymerase activity and 3’-5’ nuclease acitivty (such as T4 DNA polymerase). This in presence of any combination of A, G, C or T. Default are all four nucleotides together.
Examples
>>> import pydna
>>> a=pydna.Dseq("gatcgatc")
>>> a
Dseq(-8)
gatcgatc
ctagctag
>>> a.T4()
Dseq(-8)
gatcgatc
ctagctag
>>> a.T4("t")
Dseq(-8)
gatcgat
tagctag
>>> a.T4("a")
Dseq(-8)
gatcga
agctag
>>> a.T4("g")
Dseq(-8)
gatcg
gctag
>>>
Returns a list of linear Dseq fragments produced in the digestion. If there is not cut, the whole sequence is returned.
Example usage:
>>> from pydna import Dseq
>>> seq=Dseq("ggatccnnngaattc")
>>> seq
Dseq(-15)
ggatccnnngaattc
cctaggnnncttaag
>>> from Bio.Restriction import BamHI,EcoRI
>>> type(seq.cut(BamHI))
<type 'list'>
>>> for frag in seq.cut(BamHI):
... print frag.fig()
Dseq(-5)
g
cctag
Dseq(-14)
gatccnnngaattc
gnnncttaag
>>> seq.cut(EcoRI, BamHI) == seq.cut(BamHI, EcoRI)
True
>>> a,b,c = seq.cut(EcoRI, BamHI)
>>> a+b+c
Dseq(-15)
ggatccnnngaattc
cctaggnnncttaag
>>>
Returns a representation of the sequence, truncated if longer than 40 bp:
Examples
>>> import pydna
>>> a=pydna.Dseq("atcgcttactagcgtactgatcatctgactgactagcgtga")
>>> a
Dseq(-41)
atcgcttactagcgtactga...catctgactgactagcgtga
tagcgaatgatcgcatgact...gtagactgactgatcgcact
>>>
Fill in of five prime protruding end with a DNA polymerase that has only DNA polymerase activity (such as exo-klenow) and any combination of A, G, C or T. Default are all four nucleotides together.
Examples
>>> import pydna
>>> a=pydna.Dseq("aaa", "ttt")
>>> a
Dseq(-3)
aaa
ttt
>>> a.fill_in()
Dseq(-3)
aaa
ttt
>>> b=pydna.Dseq("caaa", "cttt")
>>> b
Dseq(-5)
caaa
tttc
>>> b.fill_in()
Dseq(-5)
caaag
gtttc
>>> b.fill_in("g")
Dseq(-5)
caaag
gtttc
>>> b.fill_in("tac")
Dseq(-5)
caaa
tttc
>>> b=pydna.Dseq("aaac", "tttg")
>>> c=pydna.Dseq("aaac", "tttg")
>>> c
Dseq(-5)
aaac
gttt
>>> c.fill_in()
Dseq(-5)
aaac
gttt
>>>
Returns a tuple describing the structure of the 5’ end of the DNA fragment
>>> import pydna
>>> a=pydna.Dseq("aaa", "ttt")
>>> a
Dseq(-3)
aaa
ttt
>>> a.five_prime_end()
('blunt', '')
>>> a=pydna.Dseq("aaa", "ttt", ovhg=1)
>>> a
Dseq(-4)
aaa
ttt
>>> a.five_prime_end()
("3'", 't')
>>> a=pydna.Dseq("aaa", "ttt", ovhg=-1)
>>> a
Dseq(-4)
aaa
ttt
>>> a.five_prime_end()
("5'", 'a')
>>>
Returns a Sets the Dseq object to circular. This can only be done if the two ends are compatible, otherwise a TypeError is raised.
Examples
>>> import pydna
>>> a=pydna.Dseq("catcgatc")
>>> a
Dseq(-8)
catcgatc
gtagctag
>>> a.looped()
Dseq(o8)
catcgatc
gtagctag
>>> a.T4("t")
Dseq(-8)
catcgat
tagctag
>>> a.T4("t").looped()
Dseq(o7)
catcgat
gtagcta
>>> a.T4("a")
Dseq(-8)
catcga
agctag
>>> a.T4("a").looped()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pydna/dsdna.py", line 357, in looped
if type5 == type3 and str(sticky5) == str(rc(sticky3)):
TypeError: DNA cannot be circularized.
5' and 3' sticky ends not compatible!
>>>
Simulates treatment a nuclease with 5’-3’ and 3’-5’ single strand specific exonuclease activity (such as mung bean nuclease):
ggatcc -> gatcc
ctaggg ctagg
ggatcc -> ggatc
tcctag cctag
>>> import pydna
>>> b=pydna.Dseq("caaa", "cttt")
>>> b
Dseq(-5)
caaa
tttc
>>> b.mung()
Dseq(-3)
aaa
ttt
>>> c=pydna.Dseq("aaac", "tttg")
>>> c
Dseq(-5)
aaac
gttt
>>> c.mung()
Dseq(-3)
aaa
ttt
Returns a tuple describing the structure of the 5’ end of the DNA fragment
>>> import pydna
>>> a=pydna.Dseq("aaa", "ttt")
>>> a
Dseq(-3)
aaa
ttt
>>> a.three_prime_end()
('blunt', '')
>>> a=pydna.Dseq("aaa", "ttt", ovhg=1)
>>> a
Dseq(-4)
aaa
ttt
>>> a.three_prime_end()
("3'", 'a')
>>> a=pydna.Dseq("aaa", "ttt", ovhg=-1)
>>> a
Dseq(-4)
aaa
ttt
>>> a.three_prime_end()
("5'", 't')
>>>
This function returns all DNA sequences found in data. If no sequences are found, an empty list is returned. This is a greedy function, use carefully.
Parameters : | data : string or iterable
filter : bool
obj : string
|
---|---|
Returns : | list :
|
Notes
The data parameter is a string containing:
returns the reverse complement of sequence (string) accepts mixed DNA/RNA
This function is similar the parse funtion but returns only the first sequence found.
Parameters : | data : string
filter : bool
obj : string
|
---|---|
Returns : | Drecord :
|
See also
Notes
The data parameter is similar to the data parameter for parse.
This module provides functions for PCR.
Primers with 5’ tails as well as inverse PCR on circular templates are handled correctly.
Holds information about a PCR reaction involving two primers and one template. This class is used by the Anneal class and is not meant to be instantiated directly.
Parameters : | forward_primer : SeqRecord(Biopython)
reverse_primer : SeqRecord(Biopython)
template : Drecord
saltc : float, optional
fc : float, optional
rc : float, optional
|
---|
Methods
5gctactacacacgtactgactg3
|||||||||||||||||||||| tm 52.6 (dbd) 58.3
5gctactacacacgtactgactg...caagatagagtcagtaaccaca3
3cgatgatgtgtgcatgactgac...gttctatctcagtcattggtgt5
|||||||||||||||||||||| tm 49.1 (dbd) 57.7
3gttctatctcagtcattggtgt5
Returns : | figure:string :
|
---|
Notes
tm is the melting temperature (Tm) calculated according to SantaLucia 1998 [1] for each primer.
(dbd) is Tm calculation for enzymes with dsDNA binding domains like Pfu-Sso7d [2]. See [3] for more information.
References
[R1] |
|
[R2] |
|
[R3] | http://www.thermoscientificbio.com/webtools/tmc/ |
Returns a Drecord object containing flankdnlength bases downstream of the reverse primer footprint. Truncated if the template is not long enough.
<---- flankdn ------>
3actactgactatctTAATAA5
||||||||||||||
acgcattcagctactgtactactgactatctatcgtacatgtactatcgtat
Returns a Drecord object containing flankuplength bases upstream of the forward primer footprint, Truncated if the template is not long enough.
<--- flankup --->
5TAATAAactactgactatct3
||||||||||||||
acgcattcagctactgtactactgactatctatcg
Returns the PCR product as a Drecord object. Primers are marked as SeqFeatures(Biopython) associated with the sequence.
Returns a string containing a text representation of two proposed PCR programs. The first program is adapted for Taq poymerase, while the second is adapted for Pfu-Sso7d.
Taq (rate 30 nt/s)
Three-step| 30 cycles | |SantaLucia 1998
94.0°C |94.0°C | |SaltC 50mM
__________|_____ 72.0°C |72.0°C|
04min00s |30s \ ________|______|
| \ 46.0°C/ 0min 1s|10min |
| \_____/ | |
| 30s | |4-8°C
Pfu-Sso7d (rate 15s/kb)
Three-step| 30 cycles | |Breslauer1986,SantaLucia1998
98.0°C |98.0°C | |SaltC 50mM
__________|_____ 72.0°C |72.0°C|Primer1C 1µM
00min30s |10s \ 61.0°C ________|______|Primer2C 1µM
| \______/ 0min 0s|10min |
| 10s | |4-8°C
Parameters : | primers : iterable containing SeqRecord objects
template : Drecord object
homology_limit : int, optional
max_product_size: int, optional :
|
---|
Examples
>>> import pydna
>>> template = pydna.Drecord("tacactcaccgtctatcattatctactatcgactgtatcatctgatagcac")
>>> from Bio.SeqRecord import SeqRecord
>>> p1=pydna.read(">p1\ntacactcaccgtctatcattatc", obj="SeqRecord")
>>> p2 = pydna.read(">p1\ncgactgtatcatctgatagcac", obj="SeqRecord").reverse_complement()
>>> pydna.Anneal((p1,p2), template)
Anneal(amplicons = 1)
>>> pydna.Anneal((p1,p2), template).amplicons
[Amplicon(51)]
>>> amplicon_list = pydna.Anneal((p1,p2), template).amplicons
>>> amplicon = amplicon_list.pop()
>>> amplicon
Amplicon(51)
>>> print amplicon.detailed_figure()
5tacactcaccgtctatcattatc...cgactgtatcatctgatagcac3
|||||||||||||||||||||| tm 50.6 (dbd) 60.5
3gctgacatagtagactatcgtg5
5tacactcaccgtctatcattatc3
||||||||||||||||||||||| tm 49.4 (dbd) 58.8
3atgtgagtggcagatagtaatag...gctgacatagtagactatcgtg5
>>> prod = amplicon.pcr_product()
>>> prod.annotations['date'] = '02-FEB-2013'
>>> print prod
Drecord
circular: False
size: 51
ID: 51bp U96+TO06Y6pFs74SQx8M1IVTBiY
Name: 51bp_PCR_prod
Description: Primers p1 <unknown name>
Number of features: 2
/date=02-FEB-2013
Dseq(-51)
tacactcaccgtctatcatt...actgtatcatctgatagcac
atgtgagtggcagatagtaa...tgacatagtagactatcgtg
>>>
Attributes
Methods
Returns the melting temperature (Tm) of the primer using the basic formula.
Parameters : | primer : string
|
---|---|
Returns : | tm : int |
Examples
>>> from pydna.amplify import basictm
>>> basictm("ggatcc")
20
>>>
pcr is a convenience function for Anneal to simplify its usage, especially from the command line. If more than one PCR product is formed, an exception is raised.
args is any iterable of sequences or an iterable of iterables of sequences. args will be greedily flattened.
Parameters : | args : iterable containing sequence objects
limit : int = 13, optional
|
---|---|
Returns : | product : Drecord
|
Notes
sequences in args could be of type:
string Seq SeqRecord Drecord
The last sequence will be interpreted as the template all preceeding sequences as primers.
This is a powerful function, use with care!
Examples
>>> import pydna
>>> template = pydna.Drecord("tacactcaccgtctatcattatctactatcgactgtatcatctgatagcac")
>>> from Bio.SeqRecord import SeqRecord
>>> p1 = pydna.read(">p1\ntacactcaccgtctatcattatc", obj="SeqRecord")
>>> p2 = pydna.read(">p1\ncgactgtatcatctgatagcac", obj="SeqRecord").reverse_complement()
>>> pydna.pcr(p1, p2, template)
Drecord(-51)
>>> pydna.pcr([p1, p2], template)
Drecord(-51)
>>> pydna.pcr((p1,p2,), template)
Drecord(-51)
>>>
Returns the melting temperature (Tm) of the primer using the nearest neighbour algorithm. Formula and thermodynamic data is taken from Breslauer 1986. These data are no longer widely used.
Breslauer 1986, table 2 [1]
pair | dH | dS | dG |
---|---|---|---|
AA/TT | 9.1 | 24.0 | 1.9 |
AT/TA | 8.6 | 23.9 | 1.5 |
TA/AT | 6.0 | 16.9 | 0.9 |
CA/GT | 5.8 | 12.9 | 1.9 |
GT/CA | 6.5 | 17.3 | 1.3 |
CT/GA | 7.8 | 20.8 | 1.6 |
GA/CT | 5.6 | 13.5 | 1.6 |
CG/GC | 11.9 | 27.8 | 3.6 |
GC/CG | 11.1 | 26.7 | 3.1 |
GG/CC | 11.0 | 26.6 | 3.1 |
Parameters : | primer : string
|
---|---|
Returns : | tm : float |
References
[R4] | K.J. Breslauer et al., “Predicting DNA Duplex Stability from the Base Sequence,” Proceedings of the National Academy of Sciences 83, no. 11 (1986): 3746. |
Returns the tm for a primer using a formula adapted to polymerases with a DNA binding domain.
Parameters : | primer : string
primerc : float
saltc : float, optional
thermodynamics : bool, optional
|
---|---|
Returns : | tm : float
tm,dH,dS : tuple
|
Returns the melting temperature (Tm) of the primer using the nearest neighbour algorithm. Formula and thermodynamic data is taken from SantaLucia 1998. This implementation gives the same answer as the one provided by Biopython (See Examples).
Thermodynamic data used:
pair | dH | dS |
---|---|---|
AA/TT | 7.9 | 22.2 |
AT/TA | 7.2 | 20.4 |
TA/AT | 7.2 | 21.3 |
CA/GT | 8.5 | 22.7 |
GT/CA | 8.4 | 22.4 |
CT/GA | 7.8 | 21.0 |
GA/CT | 8.2 | 22.2 |
CG/GC | 10.6 | 27.2 |
GC/CG | 9.8 | 24.4 |
GG/CC | 8.0 | 19.9 |
Parameters : | primer : string
|
---|---|
Returns : | tm : float
|
References
[R5] |
|
Provides functions for assembly of sequences by homologous recombination. Given a list of sequences (Drecords), all sequences will be analyzed for overlapping regions of DNA (common substrings).
The assembly algorithm is based on forming a network where each overlapping sequence forms a node and intervening sequences form edges.
Then all possible linear or circular assemblies will be returned in the order of their length.
Accepts a list of Drecords and tries to assemble them into a circular assembly by homologous recombination based on shared regions of homology with a minimum length given by limit.
Parameters : | form_rec_list : list
limit : int, optional
|
---|---|
Returns : | frecs, cp : tuple
|
Accepts a list of Drecords and tries to assemble them into a linear assembly by homologous recombination based on shared regions of homology with a minimum length given by limit.
Parameters : | form_rec_list : list
limit : int, optional
|
---|---|
Returns : | frecs, lp : tuple
|
Provides a class for downloading sequences from genbank.
Class to facilitate download from genbank.
Parameters : | users_email : string
proxy : string, optional
tool : string, optional
|
---|
Methods
Download a genbank record using a Genbank object.
item is a string containing one genbank acession number [1] for a nucleotide file:
References
[R6] | http://www.dsimb.inserm.fr/~fuchs/M2BI/AnalSeq/Annexes/Sequences/Accession_Numbers.htm |
This module provides a class for opening a sequence using an editor that accepts a path as a command line argument.
ApE - A plasmid Editor [R7] is such an editor.
>>> import pydna
>>> ape = pydna.Editor("tclsh /home/bjorn/ApE/apeextractor/ApE.vfs/lib/app-AppMain/AppMain.tcl")
>>> ape.open("aaa")
>>>
This module provides miscellaneous functions.
This function tries to copy all features in source_seq and copy them to target_seq. Source_sr and target_sr are objects with a features property, such as Drecord or Biopython SeqRecord.
Parameters : | source_seq : SeqRecord or Drecord
target_seq : SeqRecord or Drecord
|
---|---|
Returns : | bool : True
|
Compares two or more DNA sequences for equality i.e. they represent the same DNA molecule. Comparisons are case insensitive.
Parameters : | args : iterable
circular : bool, optional
linear : bool, optional
|
---|---|
Returns : | eq : bool
|
Notes
Compares two or more DNA sequences for equality i.e. if they represent the same DNA molecule.
Two linear sequences are considiered equal if either:
Two circular sequences are considiered equal if:
AND
The topology for the comparison can be set using one of the keywords linear or circular to True or False.
If circular or linear is not set, it will be deduced from the topology of each sequence for sequences that have a linear or circular attribute (like Dseq and Drecord).
Examples
>>> from pydna import eq, Drecord
>>> eq("aaa","AAA")
True
>>> eq("aaa","AAA","TTT")
True
>>> eq("aaa","AAA","TTT","tTt")
True
>>> eq("aaa","AAA","TTT","tTt", linear=True)
True
>>> eq("Taaa","aTaa", linear = True)
False
>>> eq("Taaa","aTaa", circular = True)
True
>>> a=Drecord("Taaa")
>>> b=Drecord("aTaa")
>>> eq(a,b)
False
>>> eq(a,b,circular=True)
True
>>> a.circular=True
>>> b.circular=True
>>> eq(a,b)
True
>>> eq(a,b,circular=False)
False
>>> eq(a,b,linear=True)
False
>>> eq(a,b,linear=False)
True
>>> eq("ggatcc","GGATCC")
True
>>> eq("ggatcca","GGATCCa")
True
>>> eq("ggatcca","tGGATCC")
True
Shift the origin of seq which is assumed to be a circular sequence.
Parameters : | seq : string, Biopython Seq, Biopython SeqRecord, Dseq or Drecord
|
---|---|
Returns : | new_seq : string, Biopython Seq, Biopython SeqRecord, Dseq or Drecord
|
Examples
>>> import pydna
>>> pydna.shift_origin("taaa",1)
'aaat'
>>> pydna.shift_origin("taaa",0)
'taaa'
>>> pydna.shift_origin("taaa",2)
'aata'
>>> pydna.shift_origin("gatc",2)
'tcga'
Synchronize two circular sequences.
Parameters : | seq : string, Biopython Seq, Biopython SeqRecord, Dseq or Drecord
ref : string, Biopython Seq, Biopython SeqRecord, Dseq or Drecord
|
---|---|
Returns : | sequence : string, Biopython Seq, Biopython SeqRecord, Dseq or Drecord
This function tries to rotate the circular sequence seq : so that it has a maximum overlap with ref. : |
Examples
>>> import pydna
>>> pydna.sync("taaatc","aaataa")
'aaatct'
>>> pydna.sync("taaatc","aaataa")
'aaatct'
>>> pydna.sync("taaat","aaataa")
'aaatt'