biotaphy.common.data_readers

Module containing tools for reading alignment files in various formats.

Module Contents

Functions

create_sequence_list_from_dict(values_dict)

Creates a list of sequences from a dictionary.

get_character_matrix_from_sequences_list(sequences, var_headers=None)

Converts a list of sequences into a character matrix.

load_alignment_from_filename(filename)

Attempts to load an alignment from a file path by guessing schema.

read_csv_alignment_flo(csv_flo)

Reads a CSV file-like object and returns a list of sequences and headers.

read_json_alignment_flo(json_flo)

Reads a JSON file-like object and returns a list of sequences and headers.

read_phylip_alignment_flo(phylip_flo)

Reads a phylip alignment file-like object and returns the sequences.

read_table_alignment_flo(table_flo)

Reads a table from a file-like object.

exception biotaphy.common.data_readers.AlignmentIOError

Bases: Exception


biotaphy.common.data_readers.create_sequence_list_from_dict(values_dict)

Creates a list of sequences from a dictionary.

Parameters

values_dict (dict) – A dictionary mapping each taxon name to a list of values for that taxon.

Note

  • The dictionary should have structure:

    {
        "{taxon_name}" : [{values}]
    }
    
Returns

A list of Sequence objects and None for headers.

Return type

list

Raises

AlignmentIOError – If a dictionary value is not a list.
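
As an illustration of the expected input, the following is a minimal sketch that does not import biotaphy. The AlignmentIOError class here is a local stand-in, and check_values_dict is a hypothetical helper that only mirrors the documented failure condition (a value that is not a list); it is not the library's implementation.

```python
class AlignmentIOError(Exception):
    """Local stand-in for biotaphy.common.data_readers.AlignmentIOError."""


def check_values_dict(values_dict):
    """Hypothetical helper: verify that every taxon maps to a list."""
    for taxon_name, values in values_dict.items():
        if not isinstance(values, list):
            raise AlignmentIOError(
                "Values for {} must be a list".format(taxon_name))
    return True


# Example input following the documented structure (taxon names invented).
values_dict = {
    "Taxon_A": [1.0, 2.5, 3.2],
    "Taxon_B": [0.9, 2.7, 3.0],
}
is_valid = check_values_dict(values_dict)
```

A dictionary such as {"Taxon_C": 1.0} would fail this check, because the value is a bare number rather than a list.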

biotaphy.common.data_readers.get_character_matrix_from_sequences_list(sequences, var_headers=None)

Converts a list of sequences into a character matrix.

Parameters
  • sequences (list of Sequence) – A list of Sequence objects to be converted.

  • var_headers (list of headers, optional) – If provided, uses these as variable headers for the columns in the matrix.

Returns

A matrix of sequence data.

Return type

Matrix
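
The conversion can be pictured as stacking one row per sequence. The sketch below is an assumption-laden illustration, not the library's code: sequences are represented as plain (name, values) tuples rather than Sequence objects, the result is a dictionary rather than a Matrix, and the default column labels are invented.

```python
def build_character_rows(sequences, var_headers=None):
    """Hypothetical helper: stack (name, values) pairs into labeled rows."""
    row_headers = [name for name, _ in sequences]
    data = [list(values) for _, values in sequences]
    num_cols = len(data[0]) if data else 0
    if var_headers is None:
        # Invented default labels, used only when no headers are given.
        var_headers = ["Column {}".format(i) for i in range(num_cols)]
    return {"rows": row_headers, "columns": list(var_headers), "data": data}


matrix = build_character_rows(
    [("Taxon_A", [1, 2]), ("Taxon_B", [3, 4])],
    var_headers=["v1", "v2"],
)
```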

biotaphy.common.data_readers.load_alignment_from_filename(filename)

Attempts to load an alignment from a file path by guessing schema.

Parameters

filename (str) – The path of the file containing the alignment.

Raises

RuntimeError – Raised when the method needed to load the alignment cannot be determined.

Returns

A tuple containing a list of sequences and headers.

Return type

tuple
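
How the schema is guessed is not stated here. One plausible approach, sketched below purely as an assumption, is to dispatch on the file extension. The extension table and the guess_reader_name helper are hypothetical; only the reader function names and the RuntimeError come from this page.

```python
import os

# Assumed mapping from file extension to the reader named in this module.
READER_BY_EXTENSION = {
    ".csv": "read_csv_alignment_flo",
    ".json": "read_json_alignment_flo",
    ".phylip": "read_phylip_alignment_flo",
    ".tbl": "read_table_alignment_flo",
}


def guess_reader_name(filename):
    """Hypothetical helper: pick a reader name from the file extension."""
    _, ext = os.path.splitext(filename)
    try:
        return READER_BY_EXTENSION[ext.lower()]
    except KeyError:
        raise RuntimeError(
            "Cannot determine how to load {}".format(filename))


reader_name = guess_reader_name("alignment.json")
```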

biotaphy.common.data_readers.read_csv_alignment_flo(csv_flo)

Reads a CSV file-like object and returns a list of sequences and headers.

Parameters

csv_flo (file-like) – A file-like object with CSV alignment data.

Returns

A list of Sequence objects and headers.

Raises

AlignmentIOError – If the number of columns is inconsistent across the sequences.
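
The column-consistency check can be sketched with the standard csv module. The layout assumed here (a header row, with the taxon name in the first column) is a guess, the AlignmentIOError class is a local stand-in, and read_csv_rows is a hypothetical helper that returns plain tuples instead of Sequence objects.

```python
import csv
import io


class AlignmentIOError(Exception):
    """Local stand-in for biotaphy.common.data_readers.AlignmentIOError."""


def read_csv_rows(csv_flo):
    """Hypothetical helper: parse rows and enforce a constant column count."""
    reader = csv.reader(csv_flo)
    # Assumption: the first row holds headers, the first column holds names.
    headers = next(reader)[1:]
    sequences = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        name, values = row[0], [float(value) for value in row[1:]]
        if len(values) != len(headers):
            raise AlignmentIOError(
                "Row for {} has {} values, expected {}".format(
                    name, len(values), len(headers)))
        sequences.append((name, values))
    return sequences, headers


csv_flo = io.StringIO("name,var_1,var_2\nTaxon_A,1.0,2.5\nTaxon_B,0.9,2.7\n")
sequences, headers = read_csv_rows(csv_flo)
```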

biotaphy.common.data_readers.read_json_alignment_flo(json_flo)

Reads a JSON file-like object and returns a list of sequences and headers.

Parameters

json_flo (file-like) – A file-like object with JSON alignment data.

Note

  • File should have structure:

    {
        "headers" : [{header_names}],
        "values" : [
            {
                "name" : "{taxon_name}",
                "values" : [{values}]
            }
        ]
    }
    
Returns

A list of Sequence objects and headers.

Raises

AlignmentIOError – If headers are provided but they are not a list.
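
To make the documented structure concrete, the sketch below builds a small JSON document in memory and parses it with the standard json module. It is an illustration under assumptions: AlignmentIOError is a local stand-in, read_json_rows is a hypothetical helper, and sequences are returned as plain tuples rather than Sequence objects.

```python
import io
import json


class AlignmentIOError(Exception):
    """Local stand-in for biotaphy.common.data_readers.AlignmentIOError."""


def read_json_rows(json_flo):
    """Hypothetical helper: parse the documented JSON alignment layout."""
    document = json.load(json_flo)
    headers = document.get("headers")
    # Mirror the documented error: headers, if present, must be a list.
    if headers is not None and not isinstance(headers, list):
        raise AlignmentIOError("Headers must be a list")
    sequences = [
        (entry["name"], entry["values"]) for entry in document["values"]
    ]
    return sequences, headers


json_flo = io.StringIO(json.dumps({
    "headers": ["var_1", "var_2"],
    "values": [
        {"name": "Taxon_A", "values": [1.0, 2.5]},
        {"name": "Taxon_B", "values": [0.9, 2.7]},
    ],
}))
sequences, headers = read_json_rows(json_flo)
```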

biotaphy.common.data_readers.read_phylip_alignment_flo(phylip_flo)

Reads a phylip alignment file-like object and returns the sequences.

Parameters

phylip_flo (file-like) – The phylip file-like object.

Note

  • We assume that the phylip files are extended and not strict (in terms of how many characters for taxon names).

  • The phylip file is in the format:

    numoftaxa numofsites
    seqlabel sequence
    seqlabel sequence

Returns

A list of Sequence objects.

Raises

AlignmentIOError – If there is a problem creating sequences.
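
The extended (relaxed) layout described in the note can be parsed roughly as follows. This is a sketch, not the library's parser: AlignmentIOError is a local stand-in, read_relaxed_phylip is a hypothetical helper, and the checks against the header counts are an assumption about what "a problem creating sequences" might include.

```python
import io


class AlignmentIOError(Exception):
    """Local stand-in for biotaphy.common.data_readers.AlignmentIOError."""


def read_relaxed_phylip(phylip_flo):
    """Hypothetical helper: parse 'label sequence' lines after the header."""
    lines = [line.strip() for line in phylip_flo if line.strip()]
    num_taxa, num_sites = (int(part) for part in lines[0].split())
    sequences = []
    for line in lines[1:]:
        # Relaxed format: the label ends at the first run of whitespace.
        label, raw_sequence = line.split(None, 1)
        sequence = "".join(raw_sequence.split())
        if len(sequence) != num_sites:
            raise AlignmentIOError(
                "{} has {} sites, expected {}".format(
                    label, len(sequence), num_sites))
        sequences.append((label, sequence))
    if len(sequences) != num_taxa:
        raise AlignmentIOError(
            "Found {} taxa, expected {}".format(len(sequences), num_taxa))
    return sequences


phylip_flo = io.StringIO("2 4\nTaxon_A ACGT\nTaxon_B AC-T\n")
sequences = read_relaxed_phylip(phylip_flo)
```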

biotaphy.common.data_readers.read_table_alignment_flo(table_flo)

Reads a table from a file-like object.

Parameters

table_flo (file-like) – A file-like object containing table data.

Returns

A list of Sequence objects.

Raises

AlignmentIOError – If there is a problem creating sequences.