  1. UGENE
  2. UGENE-6125

Update DIAMOND to v0.9.22

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: master
    • Fix Version/s: 1.31
    • Component/s: NGS, Workflow
    • Labels:
      None
    • Story Points:
      3
    • Epic Link:
    • Sprint:
      DEV-31-3, DEV-31-4
    • Affect Type:
      Userdefined

      Description

      In short, the following should be done:

      • Update the executable files in the 64-bit Linux and macOS external tools packages.
      • Update the command launches for DIAMOND build and classify.
      • Slightly update the workflow element description.
      • Update the parameters.

      DIAMOND executable on 64-bit Linux

      The executable file provided on the DIAMOND website does not work on all systems by default. An executable compiled from the source code had the same problem: it worked only on the system where it was built. It is required to:

      • make the executable file non-dependent on a particular Linux system
      • add the updated files to the 64-bit Linux external tool package
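
One way to check how far an executable depends on a particular Linux system is to list the shared libraries it loads at run time. The sketch below is only an illustration, not part of the UGENE build: it assumes the standard `ldd` utility is available, and the helper names `parse_ldd_output` and `shared_library_dependencies` are hypothetical.

```python
import subprocess


def parse_ldd_output(text):
    """Return the shared libraries named in `ldd` output (empty list if none)."""
    libraries = []
    for line in text.splitlines():
        line = line.strip()
        # Static executables produce a message instead of a library list.
        if not line or "not a dynamic executable" in line or line.startswith("statically linked"):
            continue
        # Typical lines: "libm.so.6 => /lib/libm.so.6 (0x...)" or "linux-vdso.so.1 (0x...)"
        libraries.append(line.split()[0])
    return libraries


def shared_library_dependencies(executable_path):
    """Run `ldd` on the executable and list the libraries it needs at run time."""
    result = subprocess.run(["ldd", executable_path], capture_output=True, text=True)
    return parse_ldd_output(result.stdout)
```

A short list (or an empty one for a statically linked build) suggests that the file is less tied to the libraries of the build machine.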

      DIAMOND executable on 64-bit macOS

      • build the source code
      • add the updated files to the 64-bit macOS external tool package

      Command launches

      Since DIAMOND version 0.9.19, the taxonomy files are passed to the tool when the database is built, not during alignment. Modify the tool launches accordingly (parameters "--taxonmap" and "--taxonnodes").

      Build DIAMOND Database:

      diamond makedb --in .../uniref50.fasta.gz -d .../uniref50.dmnd --taxonmap .../data/ngs_classification/taxonomy/prot.accession2taxid.gz --taxonnodes .../data/ngs_classification/taxonomy/nodes.dmp
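
The build launch above can be sketched as an argument list. This is a minimal illustration, not the actual UGENE code: the function name `makedb_command` and its parameters are hypothetical, and the input FASTA file is assumed to be passed with DIAMOND's `--in` option.

```python
def makedb_command(fasta_path, database_path, taxonmap_path, taxonnodes_path):
    """Argument list for building a DIAMOND database with taxonomy information."""
    return [
        "diamond", "makedb",
        "--in", fasta_path,              # reference protein sequences
        "-d", database_path,             # resulting database file
        "--taxonmap", taxonmap_path,     # accession-to-taxon mapping file
        "--taxonnodes", taxonnodes_path, # NCBI taxonomy nodes file
    ]
```

Keeping the arguments in a list avoids shell quoting problems when the paths contain spaces.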
      

      Classify Sequences with DIAMOND:

      diamond blastx -d .../uniref50.dmnd -f 102 other_parameters
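
A minimal sketch of the classify launch and of reading its result follows. It assumes the output has the three tab-delimited fields listed in the "Output file" parameter description (query ID, NCBI taxonomy ID, E-value); the names `Classification`, `classify_command` and `parse_classification` are hypothetical and not taken from UGENE.

```python
from typing import NamedTuple


class Classification(NamedTuple):
    query_id: str
    taxonomy_id: int  # 0 if unclassified
    e_value: float    # 0 if unclassified


def classify_command(database_path, other_parameters=()):
    """Argument list for classification; no taxonomy files are passed here."""
    return ["diamond", "blastx", "-d", database_path, "-f", "102", *other_parameters]


def parse_classification(lines):
    """Parse tab-delimited classification lines into Classification records."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        query_id, taxonomy_id, e_value = line.split("\t")
        records.append(Classification(query_id, int(taxonomy_id), float(e_value)))
    return records
```

A record with `taxonomy_id == 0` corresponds to an unclassified query.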
      

      Update the workflow element description in the Property Editor

      In general, DIAMOND is a sequence aligner for protein and translated DNA searches, similar to the NCBI BLAST software tools. However, it can be up to 20,000 times faster than BLAST.
      
      This workflow element uses DIAMOND for taxonomic classification of short DNA reads and longer sequences such as contigs. The lowest common ancestor (LCA) algorithm is used for the classification.
      

      Update DIAMOND parameters

      1. Change the default value of the "Block size" parameter to "0.5" to increase the chances that DIAMOND can run on a typical computer by default.
      2. Modify the description of the "Expected value" parameter:
        Maximum expected value to report an alignment (--evalue/-e).
        
      3. Modify the description of the "Output file" parameter:
        Specify the output file name.
        
        The output file is a tab-delimited file with the following fields:
        * Query ID
        * NCBI taxonomy ID (0 if unclassified)
        * E-value of the best alignment with a known taxonomy ID found for the query (0 if unclassified)
        
      4. Add a new parameter "Top alignments percentage":
        • Put this parameter under "Sensitive mode" in the Property Editor.
        • The value should be entered via a spin box that accepts integer values from 0 to 100. Show "%" next to the value.
        • The default value is "10".
        • The description should be the following:
          DIAMOND uses the lowest common ancestor (LCA) algorithm for taxonomic classification of the input sequences. This parameter specifies which alignments are taken into account in the calculation (--top).
          
          For example, the default value "10" means that, for each query, only the alignments whose score is at most 10% lower than the best alignment score are used to calculate the lowest common ancestor.
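
The calculation described for "Top alignments percentage" can be illustrated with a small sketch. It is not DIAMOND's implementation. It assumes that the percentage is applied to the alignment score (hits scoring within the given percentage of the best hit are kept) and that the taxonomy is available as a child-to-parent map in which the root points to itself. All function names and the toy data are hypothetical.

```python
def hits_within_top_percent(hits, top_percent):
    """Keep (taxon, score) hits whose score is at most top_percent % below the best."""
    if not hits:
        return []
    best = max(score for _, score in hits)
    threshold = best * (1 - top_percent / 100)
    return [(taxon, score) for taxon, score in hits if score >= threshold]


def lineage(taxon, parent):
    """Path from a taxon up to the root, following a child -> parent map."""
    path = [taxon]
    while parent.get(taxon, taxon) != taxon:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all given taxa (None if no taxa)."""
    taxa = list(taxa)
    if not taxa:
        return None
    common = lineage(taxa[0], parent)  # ordered from the taxon up to the root
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent))
        common = [node for node in common if node in other]
    return common[0] if common else None


# Toy taxonomy: 1 is the root; 2 and 5 are its children; 3 and 4 are children of 2.
parent = {1: 1, 2: 1, 3: 2, 4: 2, 5: 1}
hits = [(3, 100.0), (4, 95.0), (5, 50.0)]
kept = hits_within_top_percent(hits, 10)
lca = lowest_common_ancestor([taxon for taxon, _ in kept], parent)
```

In this toy example, a value of 10 keeps only the hits to taxa 3 and 4, so the query is assigned to their parent, taxon 2. A value of 100 keeps all three hits and the assignment moves up to the root.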
          

              People

              Assignee:
              atiunov Aleksey Tiunov [X] (Inactive)
              Reporter:
              oigl Olga Golosova
              Assigned Tester:
              Dmitrii Sukhomlinov