For the second article in our Big Bang Blog Series, we break down the differences between the WIPO Standards ST.25 and ST.26. Read along for details including a direct comparison of the old and the new.
What are the benefits of new WIPO standard ST.26?
- ST.26 allows standardization of sequence listing filing across multiple patent offices.
- Data could be lost when ST.25 sequence listings were transferred to sequence databases, while ST.26 is compliant with current public sequence database requirements.
- With ST.26 sequence listings, Offices and applicants will benefit from automated validation and comprehensive searching capabilities.
What are the main differences between ST.25 and ST.26?
- ST.25-compliant sequence listings could be filed as TXT or PDF files, while ST.26-compliant sequence listings must be filed in XML (eXtensible Markup Language) format.
- ST.25 does not require inclusion of D-amino acids, linear portions of branched sequences, or nucleotide analogs, while ST.26 does.
- ST.25 permits sequences with fewer than 10 specifically defined nucleotides or fewer than 4 specifically defined amino acids, while ST.26 prohibits such sequences.
- In ST.26, DNA and RNA molecule types must be further described using a mandatory mol_type qualifier.
- For more details on ST.25 to ST.26 changes, see the helpful table below, reproduced from WIPO’s ST.26 Introduction Webinar, “WIPO ST.26: Introduction”.
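To make the move from numbered text fields to XML elements and attributes more concrete, here is a minimal Python sketch that builds a single sequence entry with a mandatory mol_type qualifier. The helper name `build_sequence_entry` is hypothetical, and the element names (`SequenceData`, `INSDSeq`, `INSDQualifier`, and so on) and the mol_type value "other DNA" reflect our understanding of the ST.26 XML structure; they are assumptions here and should be checked against the official standard and DTD. The output is a fragment for illustration, not a complete or validated sequence listing.

```python
import xml.etree.ElementTree as ET


def build_sequence_entry(seq_id, residues, moltype, mol_type_value, organism):
    """Build one ST.26-style sequence entry as an XML element (illustrative only).

    Element names are assumptions based on the ST.26 structure and are not
    validated against the official DTD.
    """
    data = ET.Element("SequenceData", sequenceIDNumber=str(seq_id))
    seq = ET.SubElement(data, "INSDSeq")
    ET.SubElement(seq, "INSDSeq_length").text = str(len(residues))
    ET.SubElement(seq, "INSDSeq_moltype").text = moltype  # DNA, RNA, or AA
    ET.SubElement(seq, "INSDSeq_division").text = "PAT"

    # A single "source" feature spanning the whole sequence carries the
    # mol_type and organism qualifiers.
    table = ET.SubElement(seq, "INSDSeq_feature-table")
    feature = ET.SubElement(table, "INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = "source"
    ET.SubElement(feature, "INSDFeature_location").text = f"1..{len(residues)}"
    quals = ET.SubElement(feature, "INSDFeature_quals")
    for name, value in (("mol_type", mol_type_value), ("organism", organism)):
        qualifier = ET.SubElement(quals, "INSDQualifier")
        ET.SubElement(qualifier, "INSDQualifier_name").text = name
        ET.SubElement(qualifier, "INSDQualifier_value").text = value

    ET.SubElement(seq, "INSDSeq_sequence").text = residues
    return data


# Example: a 16-nucleotide artificial DNA sequence (made-up residues).
entry = build_sequence_entry(
    1, "atgcatgcatgcatgc", "DNA", "other DNA", "synthetic construct"
)
print(ET.tostring(entry, encoding="unicode"))
```

In practice, applicants would typically rely on authoring and validation software rather than hand-built XML; the sketch is only meant to show how qualifiers attach to a feature.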
| WIPO ST.25 | WIPO ST.26 |
| --- | --- |
| ASCII .txt with numeric identifiers | XML with elements and attributes |
| Not required to include: D-amino acids; linear portions of branched sequences; nucleotide analogs | Must include: D-amino acids; linear portions of branched sequences; nucleotide analogs |
| Annotation of sequences: feature keys only | Annotation of sequences: feature keys and qualifiers |
| Permitted to include sequences with: < 10 specifically defined nucleotides; < 4 specifically defined amino acids | Prohibited sequences: < 10 specifically defined nucleotides; < 4 specifically defined amino acids |
| ALL priority application information may be included | ONLY the earliest priority application can be included |
| ALL applicant and inventor names may be included | ONLY one applicant AND optionally ONE inventor may be included |
| One invention title permitted | Multiple invention titles permitted, each one in a different language |
| Applicant/inventor names and invention titles must be in basic Latin characters | Applicant/inventor names may be included using any valid Unicode character along with a basic Latin translation or transliteration |
| Sequences identified as DNA, RNA, or PRT only | Sequences identified as DNA, RNA, or AA along with a mandatory mol_type qualifier to further describe the molecule |
| Organism names: Latin genus/species; virus name; “artificial sequence”; “unknown” | Organism names: Latin genus/species; virus name; “synthetic construct”; “unidentified” |
| “u” represents uracil in nucleotide sequences | “t” represents uracil in RNA sequences and thymine in DNA sequences |
| Amino acid sequences represented by three letter abbreviations | Amino acid sequences represented by one letter abbreviations |
| “n” and “Xaa” variables must have a definition provided in a feature | Default value assumed for “n” and “X” variables with no definition |
| Feature location format not clearly defined | Strictly defined feature location formats; permits use of “<” and “>” in all sequence types, and “^”, “join”, “order”, and “complement” in nucleotide sequences |
| “Mixed mode” sequences permitted: nucleotide sequence with amino acid translation shown below | NO “mixed mode”; nucleotide translations are included in “translation” qualifiers only |
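A few of the rows above (one-letter amino acid codes, “t” for uracil, and the minimum-length thresholds) lend themselves to a short worked example. The following Python sketch is illustrative only: the function names are hypothetical, the lookup table covers just the 20 standard amino acids plus “Xaa”, and we assume that “specifically defined” means any residue other than “n” (nucleotides) or “X” (amino acids). It is not a substitute for official conversion or validation tools.

```python
# Three-letter to one-letter amino acid codes (20 standard residues plus Xaa).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def protein_to_one_letter(three_letter_seq):
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError for any code outside the small table above.
    """
    return "".join(THREE_TO_ONE[code] for code in three_letter_seq.split())


def nucleotides_to_st26(seq):
    """Lower-case a nucleotide sequence and write uracil as 't'."""
    return seq.lower().replace("u", "t")


def meets_minimum_length(seq, is_protein):
    """Check the thresholds from the table: at least 10 specifically defined
    nucleotides, or at least 4 specifically defined amino acids.

    Assumption: every residue except 'n' (nucleotide) or 'X' (amino acid)
    counts as specifically defined.
    """
    placeholder = "X" if is_protein else "n"
    minimum = 4 if is_protein else 10
    defined = sum(1 for residue in seq if residue != placeholder)
    return defined >= minimum


# Usage with made-up sequences.
protein = protein_to_one_letter("Met Ala Xaa Lys")
rna = nucleotides_to_st26("AUGGCUAUGG")
print(protein, meets_minimum_length(protein, is_protein=True))
print(rna, meets_minimum_length(rna, is_protein=False))
```

Here the protein has only three specifically defined residues, so under these assumptions it would fall below the ST.26 threshold, while the ten-nucleotide RNA would meet it.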