write_fasta

write_fasta writes a set of sequences to a FASTA file. It accepts sequence data as either:

  • a dictionary {header: sequence, ...}, or

  • a list of [header, sequence] pairs.
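The two input forms carry the same information. As a minimal sketch, the hypothetical helper as_pairs below (not part of protfasta, and not how the library is implemented) shows how either form reduces to the same ordered (header, sequence) pairs. The headers and sequences are made up for illustration, and ValueError stands in for the library's own exception type.

```python
def as_pairs(fasta_data):
    """Return (header, sequence) tuples from a dict or a list of pairs.

    Hypothetical helper for illustration only; not part of protfasta.
    """
    if isinstance(fasta_data, dict):
        # dicts preserve insertion order, so the record order is kept
        return list(fasta_data.items())
    pairs = []
    for entry in fasta_data:
        if len(entry) != 2:
            # stand-in for the library's own exception
            raise ValueError(f"expected [header, sequence], got {entry!r}")
        header, sequence = entry
        pairs.append((header, sequence))
    return pairs


# The same two made-up records in both accepted shapes
as_dict = {"example_1": "MKTAYIAK", "example_2": "GSHMLE"}
as_list = [["example_1", "MKTAYIAK"], ["example_2", "GSHMLE"]]

assert as_pairs(as_dict) == as_pairs(as_list)
```

Either as_dict or as_list could be passed as the first argument of write_fasta. One practical difference follows from Python itself: a list can hold two entries with the same header, whereas a dictionary cannot.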

protfasta.read_fasta() can also write its sanitized result directly to disk via the output_filename keyword, which calls write_fasta internally.

Keyword arguments

  • filename - destination path. Conventionally ends with .fasta or .fa, but this is not enforced.

  • linelength (default 60) - maximum residues per line. Values below 5 are clamped to 5. Set to 0, None or False to write each sequence on a single line.

  • append_to_fasta (default False) - when True, new entries are appended to an existing file rather than overwriting it.
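The linelength rules above can be sketched in a few lines of plain Python. The function wrap_sequence is a hypothetical name used only for illustration; it models the documented behaviour (clamping to 5, and no wrapping for 0, None or False) and is not taken from the protfasta source.

```python
def wrap_sequence(sequence, linelength=60):
    """Split a sequence into output lines following the documented rules.

    Hypothetical helper for illustration only; not part of protfasta.
    """
    # 0, None and False all mean "no wrapping". This check has to come
    # before the clamp, because 0 would otherwise be clamped up to 5.
    if not linelength:
        return [sequence]
    width = max(5, linelength)  # values below 5 are clamped to 5
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


seq = "ACDEFGHIKLMNPQRSTVWY"  # 20 residues

print(wrap_sequence(seq, 8))     # ['ACDEFGHI', 'KLMNPQRS', 'TVWY']
print(wrap_sequence(seq, 3))     # ['ACDEF', 'GHIKL', 'MNPQR', 'STVWY']
print(wrap_sequence(seq, None))  # ['ACDEFGHIKLMNPQRSTVWY']
```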

Performance notes

write_fasta writes output in line-length chunks with a 1 MiB write buffer, which makes it suitable for very large outputs (tens of millions of sequences and beyond). An empty sequence raises a ProtfastaException rather than being silently written out.
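The buffering strategy described above might look roughly like the following. This is a sketch under stated assumptions rather than the library's code: write_records is a hypothetical function, ValueError stands in for ProtfastaException, and the buffer is set through the buffering argument of Python's built-in open().

```python
import os
import tempfile

BUFFER_SIZE = 1024 * 1024  # 1 MiB, as described in the text


def write_records(pairs, path, linelength=60):
    """Write (header, sequence) pairs as FASTA using a large write buffer.

    Hypothetical sketch; not the protfasta implementation.
    """
    with open(path, "w", buffering=BUFFER_SIZE) as handle:
        for header, sequence in pairs:
            if len(sequence) == 0:
                # stand-in for ProtfastaException
                raise ValueError(f"empty sequence for header {header!r}")
            handle.write(f">{header}\n")
            # Emit the sequence one line-length chunk at a time, so no
            # wrapped copy of the whole sequence is built in memory.
            for start in range(0, len(sequence), linelength):
                handle.write(sequence[start:start + linelength])
                handle.write("\n")


out_path = os.path.join(tempfile.mkdtemp(), "sketch.fasta")
write_records([("seq1", "ACDEFGHIKL"), ("seq2", "MNPQ")], out_path, linelength=4)

with open(out_path) as handle:
    text = handle.read()
print(text)
```

With a large buffer, the many small write() calls are collected in memory and flushed to disk in big blocks, which is what keeps per-line writing cheap for large outputs.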

For usage examples see the Examples page.

Documentation

protfasta.write_fasta(fasta_data: dict[str, str] | list[list[str]], filename: str, linelength: int | bool | None = 60, append_to_fasta: bool = False) → None

Write sequences to a FASTA file.

Accepts sequence data as either a dictionary (header -> sequence) or a list of [header, sequence] pairs and writes a standards-compliant FASTA file to filename.

Parameters:
  • fasta_data (dict[str, str] or list[list[str]]) – Sequence data. If a dictionary, keys are headers and values are amino-acid sequences. If a list, each element must be a two-element list [header, sequence].

  • filename (str) – Destination file path. Should conventionally end with .fasta or .fa, but this is not enforced.

  • linelength (int, bool, or None, optional) – Maximum number of residues per line in the output. Default is 60 (the UniProt convention). Values below 5 are clamped to 5. Set to 0, None, or False to write each sequence on a single line.

  • append_to_fasta (bool, optional) – If True, new entries are appended to filename if it already exists; otherwise the file is created. If False (default), any existing file is overwritten.

Return type:

None

Raises:

ProtfastaException – If a sequence is empty or a list element does not contain exactly two items.
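The append_to_fasta semantics described under Parameters correspond closely to Python's "a" and "w" file modes. The sketch below assumes that mapping purely for illustration (open_mode is a hypothetical helper, and the source does not state how the library opens the file); it shows that append mode creates a missing file and preserves existing entries, while write mode replaces them.

```python
import os
import tempfile


def open_mode(append_to_fasta=False):
    """Map the documented flag onto a Python file mode (assumed mapping)."""
    return "a" if append_to_fasta else "w"


demo_path = os.path.join(tempfile.mkdtemp(), "append_demo.fasta")

# The file does not exist yet: "a" creates it.
with open(demo_path, open_mode(append_to_fasta=True)) as handle:
    handle.write(">first\nACDE\n")

# Appending again keeps the earlier entry.
with open(demo_path, open_mode(append_to_fasta=True)) as handle:
    handle.write(">second\nFGHI\n")
with open(demo_path) as handle:
    appended = handle.read()

# Default behaviour: the existing content is overwritten.
with open(demo_path, open_mode(append_to_fasta=False)) as handle:
    handle.write(">third\nKLMN\n")
with open(demo_path) as handle:
    overwritten = handle.read()

print(appended)
print(overwritten)
```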