  Calcite / CALCITE-2025

Create adapter(s) for standard bioinformatics database files

Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      Common bioinformatics file formats, used mostly in genomic medicine and life sciences research, are VCF, SAM, and FASTQ/FASTA [1,2,3,4].

      They are structured text files with metadata headers and a (generally) column-oriented layout.
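
To make the "metadata headers plus columns" layout concrete, here is a minimal sketch that splits a VCF-style text stream into header metadata, column names, and rows. It is an illustration of the file layout only, not a Calcite adapter: the sample data, the `read_vcf` helper, and the returned structure are all hypothetical, and real VCF files have features (repeated structured header keys, per-sample columns) that this sketch does not handle.

```python
import io

# A tiny, made-up VCF fragment used only for illustration.
SAMPLE_VCF = """\
##fileformat=VCFv4.2
##source=exampleCaller
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
chr1\t12345\trs001\tA\tG\t50\tPASS\tDP=10
chr2\t67890\t.\tT\tC\t99\tPASS\tDP=25
"""


def read_vcf(stream):
    """Split a VCF-style text stream into (metadata, columns, rows).

    Hypothetical helper: repeated metadata keys overwrite each other,
    which is a simplification compared with real VCF headers.
    """
    metadata = {}
    columns = []
    rows = []
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # Metadata header line of the form ##key=value
            key, _, value = line[2:].partition("=")
            metadata[key] = value
        elif line.startswith("#"):
            # Column header line: #CHROM<TAB>POS<TAB>...
            columns = line[1:].split("\t")
        else:
            # Data line: one tab-separated record, keyed by column name.
            rows.append(dict(zip(columns, line.split("\t"))))
    return metadata, columns, rows


metadata, columns, rows = read_vcf(io.StringIO(SAMPLE_VCF))
```

Each row comes back as a mapping from column name to string value, which is roughly the shape a tabular front end would need before any type conversion is applied.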

      Having Calcite support for these formats would enable it to serve as the front end for processing a very large body of important data, and would facilitate the integration of these datasets into downstream frameworks that incorporate or use Calcite.

      This issue will serve as the parent issue for each format that will be implemented (SAM, VCF, etc.).

      1. SAM file format, https://en.wikipedia.org/wiki/SAM_(file_format)
      2. VCF file format, https://en.wikipedia.org/wiki/Variant_Call_Format
      3. FASTQ file format, https://en.wikipedia.org/wiki/FASTQ_format
      4. Other, https://bioinf.comav.upv.es/courses/sequence_analysis/sequence_file_formats.html

          People

            Assignee: Edmon Begoli (ebegoli)
            Reporter: Edmon Begoli (ebegoli)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: 8,736h
                Remaining: 8,736h
                Logged: Not Specified