
How Can I Determine Whether An Email Header Is Base64 Encoded

Using the email.header package, I can do the_text,the_charset = decode_header(inputText) to get the character set of the email header, where the inputText was retrieved by a comma

Solution 1:

A header that uses RFC 2047 encoded-words (the =?charset?encoding?text?= form) can consist of one or more lines, and each line can use a different encoding, or no encoding at all.
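As a quick illustration of why decode_header alone does not answer the question, here is a minimal sketch. The two sample header values are made up for this example. decode_header returns a list of (decoded, charset) pairs, so the charset is available, but the Q or B marker is consumed during decoding and never reported.

```python
from email.header import decode_header

# Hypothetical sample values: one base64 ("B") and one quoted-printable ("Q").
b64_header = "=?utf-8?B?SGVsbG8=?="
qp_header = "=?iso-8859-1?Q?caf=E9?="

# Each result is a list of (decoded_bytes, charset) pairs.
b64_parts = decode_header(b64_header)
qp_parts = decode_header(qp_header)

# Both results carry the charset, but neither says which transfer
# encoding (base64 or quoted-printable) was used.
for decoded, charset in b64_parts + qp_parts:
    print(decoded, charset)
```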

You'll have to parse the type of encoding out yourself, one per line. Using a regular expression:

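import re

# Matches one encoded-word: =?charset?Q-or-B?encoded text?=
# The named group captures the encoding marker (case-insensitive).
quopri_entry = re.compile(r'=\?[\w-]+\?(?P<encoding>[QB])\?[^?]+?\?=', flags=re.I)
encodings = {'Q': 'quoted-printable', 'B': 'base64'}

def encoded_message_codecs(header):
    """Return one entry per header line naming the encoding used, or None."""
    used = []
    for line in header.splitlines():
        # Only the first encoded-word on each line is inspected.
        entry = quopri_entry.search(line)
        if not entry:
            # No encoded-word on this line.
            used.append(None)
            continue
        # Normalize the marker to upper case before the lookup.
        used.append(encodings.get(entry.group('encoding').upper(), 'unknown'))
    return used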

This returns a list with one entry per line of the header. Each entry is 'quoted-printable', 'base64' or 'unknown', or None if no encoded text (the =?charset?encoding?text?= form) was found on that line.
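A short usage sketch, repeating the function so the snippet runs on its own. The three-line sample header is invented for illustration: it mixes a base64 line, a quoted-printable line and a plain continuation line.

```python
import re

quopri_entry = re.compile(r'=\?[\w-]+\?(?P<encoding>[QB])\?[^?]+?\?=', flags=re.I)
encodings = {'Q': 'quoted-printable', 'B': 'base64'}

def encoded_message_codecs(header):
    used = []
    for line in header.splitlines():
        entry = quopri_entry.search(line)
        if not entry:
            used.append(None)
            continue
        used.append(encodings.get(entry.group('encoding').upper(), 'unknown'))
    return used

# Hypothetical folded header value, one encoded-word (or none) per line.
sample = (
    "=?utf-8?B?SGVsbG8gV29ybGQ=?=\n"
    " =?iso-8859-1?q?caf=E9?=\n"
    " plain text"
)

result = encoded_message_codecs(sample)
print(result)  # ['base64', 'quoted-printable', None]

# To answer the original question for a single line:
is_base64 = result[0] == 'base64'
```

Note that only the first encoded-word on each line is examined, so a line that mixes encodings would be reported by its first match.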

