Koozali.org formerly Contribs.org

Problem Reformime and ANSI_X3.4-1968?

« on: October 31, 2019, 12:10:42 PM »
I'm trying to save email attachments automatically to a folder, using qmail and reformime. I use the following script:

Code: [Select]
#!/usr/bin/env bash

# This script processes a MIME mail message read from stdin,
# extracts all PDF attachments,
# and returns the MIME message to stdout for further processing

# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C

# Setting the destination path for saved attachments
attachements='/home/e-smith/files/users/user/attachements/'

trap 'rm -f -- "$mailmessage"' EXIT # Purge temporary mail message

# Create a temporary message file
mailmessage="$(mktemp)"

# Save stdin message to tempfile
cat > "$mailmessage"

# Iterate all MIME sections from the message
while read -r mime_section; do

  # Get all section info headers
  section_info="$(reformime -s "$mime_section" -i <"$mailmessage")"

  # Parse the Content-Type header
  content_type="$(grep 'content-type' <<<"$section_info" | cut -d ' ' -f 2-)"

  # Parse the Content-Name header (if available)
  content_name="$(grep 'content-name' <<<"$section_info" | cut -d ' ' -f 2-)"

  # Decode the value of the Content-Name header
  content_name="$(reformime -h "$content_name")"

  if [[ $content_type = "application/pdf" || $content_name =~ .*\.[pP][dD][fF] ]]; then
    # Attachment is a PDF
    if [ -z "$content_name" ]; then
      # The attachment has no name, so create a random name
      content_name="$(mktemp --dry-run unnamed_XXXXXXXX.pdf)"
    fi
    # Prepend the date to the attachment filename
    filename="$(date +%Y%m%d)_$content_name"

    # Save the attachment to a file
    reformime -s "$mime_section" -e <"$mailmessage" >"$attachements/$filename"
  fi

done < <(reformime < "$mailmessage") # reformime lists all MIME sections

cat <"$mailmessage" # Re-inject the message to stdout for further processing

Problem: the script saves my file name.pdf as "attachements/20191025_[unknown character set: ANSI_X3.4-1968].pdf". So somehow reformime does not recognize the content of the Content-Name: header. Can this be due to a bad maildrop version? How can I solve this problem?

Offline ReetP

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #1 on: October 31, 2019, 01:06:07 PM »
I don't know specifically, but a generic search for the error "unknown character set: ANSI_X3.4-1968" brings up some results you might want to read through.

However, the comments in your code suggest this is a known problem, so based on those and on the search above I would check your locale settings first.

Quote
# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C

Check this for starters.

Code: [Select]
locale -v
I would also imagine that without full access to the original mail we would struggle to replicate this for testing.
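
As a quick way to see where that name comes from: on glibc systems, ANSI_X3.4-1968 is usually what the C locale reports as its character set (it is the formal name for US-ASCII). A minimal sketch, assuming a glibc-based system with the locale utility installed; the variable names are only for illustration:

```bash
#!/usr/bin/env bash
# Ask the locale utility which character set the C locale uses.
# On glibc this typically prints ANSI_X3.4-1968; other libcs may differ.
c_charmap="$(LC_ALL=C locale charmap 2>/dev/null || echo unknown)"
echo "C locale charmap:       $c_charmap"

# For comparison, the character set of the current environment
# (for example UTF-8 when LANG=nl_NL.UTF-8).
current_charmap="$(locale charmap 2>/dev/null || echo unknown)"
echo "current locale charmap: $current_charmap"
```

If the first line prints ANSI_X3.4-1968, the name in the error message probably comes from the locale forced by export LC_ALL=C rather than from the mail itself.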
...
1. Read the Manual
2. Read the Wiki
3. Don't ask for support on Unsupported versions
4. I have a job, wife, and kids and do this in my spare time. If you want something fixed, please help.

Bugs are easier than you think: http://wiki.contribs.org/Bugzilla_Help

If you love SME and don't want to lose it, join in: http://wiki.contribs.org/Koozali_Foundation

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #2 on: November 01, 2019, 02:00:13 PM »
Thanks. The output of locale -v is:
Code: [Select]
LANG=nl_NL.UTF-8
LC_CTYPE="nl_NL.UTF-8"
LC_NUMERIC="nl_NL.UTF-8"
LC_TIME="nl_NL.UTF-8"
LC_COLLATE="nl_NL.UTF-8"
LC_MONETARY="nl_NL.UTF-8"
LC_MESSAGES="nl_NL.UTF-8"
LC_PAPER="nl_NL.UTF-8"
LC_NAME="nl_NL.UTF-8"
LC_ADDRESS="nl_NL.UTF-8"
LC_TELEPHONE="nl_NL.UTF-8"
LC_MEASUREMENT="nl_NL.UTF-8"
LC_IDENTIFICATION="nl_NL.UTF-8"
LC_ALL=

I added
Code: [Select]
export LC_ALL=C LANG=C LANGUAGE=C
to my script because the script didn't work. Somebody suggested that I have a bad maildrop version; see the related mail-filter/maildrop bug: bugs.gentoo.org/304093. My maildrop version seems to be 2.5.0. How can I upgrade this package, perhaps to maildrop 2.9.3?
« Last Edit: November 01, 2019, 02:03:26 PM by joost »

Offline ReetP

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #3 on: November 01, 2019, 02:36:58 PM »
With SME v9.x you should be on 2.5.x

rpm -qa |grep maildrop

maildrop-2.5.0-13.el6.x86_64

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #4 on: November 03, 2019, 09:34:32 PM »
Yes, indeed. The package I have is maildrop-2.5.0-13.el6.x86_64. Somebody suggested that the script works for them, but they use maildrop 2.9.3. Is there a way to upgrade this?

Offline ReetP

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #5 on: November 03, 2019, 10:07:11 PM »
The link you posted to the bug said this issue was fixed in 2.4.1 so your version should be ok.

To get a newer version you'd have to build your own package.

And even then it may not actually solve your issue.

There are other comments out there eg

https://confluence.atlassian.com/confkb/filesystem-encoding-is-written-as-ansi_x3-4-1968-even-though-the-server-is-set-to-utf-8-658735809.html

You might need to try a bit more investigation.

You also might try and debug what is going on in your script.

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #6 on: November 30, 2019, 11:41:15 PM »
If I try to debug the script by adding:

Code: [Select]
# DEBUG
exec 5> debug.txt
BASH_XTRACEFD="5"
PS4='$LINENO: '
set -x

to the script.

The output is:

Code: [Select]
17: export LC_ALL=C LANG=C LANGUAGE=C
17: LC_ALL=C
17: LANG=C
17: LANGUAGE=C
20: attachements=/home/e-smith/files/users/pdf/home/
22: trap 'rm -f -- "$mailmessage"' EXIT
225: mktemp
25: mailmessage=/tmp/tmp.yFIFzj3M1e
28: cat
31: read -r mime_section
558: reformime
334: reformime -s 1 -i
34: section_info='section: 1
content-type: multipart/mixed
content-transfer-encoding: 8bit
charset: ISO-8859-1
content-language: nl-NL
starting-pos: 0
starting-pos-body: 966
ending-pos: 11065
line-count: 170
body-line-count: 150'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=multipart/mixed
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ multipart/mixed = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.1 -i
34: section_info='section: 1.1
content-type: text/plain
content-transfer-encoding: 7bit
charset: utf-8
starting-pos: 1050
starting-pos-body: 1138
ending-pos: 1146
line-count: 6
body-line-count: 3'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=text/plain
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ text/plain = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.2 -i
34: section_info='section: 1.2
content-type: application/pdf
content-name: test.pdf
content-transfer-encoding: base64
charset: ISO-8859-1
content-disposition: attachment
content-disposition-filename: test.pdf
starting-pos: 1186
starting-pos-body: 1323
ending-pos: 11023
line-count: 138
body-line-count: 132'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=application/pdf
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=test.pdf
443: reformime -h test.pdf
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ application/pdf = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
47: '[' -z '[unknown character set: ANSI_X3.4-1968]' ']'
552: date +%Y%m%d
52: filename='20191130_[unknown character set: ANSI_X3.4-1968]'
55: reformime -s 1.2 -e
31: read -r mime_section
60: cat
1: rm -f -- /tmp/tmp.yFIFzj3M1e
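
A note on reading this trace: the line numbers look odd (225, 558, 334) because bash repeats the first character of the expanded PS4 once per level of nesting, so a command run inside $(...) on line 25 is shown as 225. A minimal sketch that reproduces the effect, assuming bash 4.1 or later for BASH_XTRACEFD; trace_file and today are illustrative names:

```bash
#!/usr/bin/env bash
# Send the xtrace output to a temporary file instead of stderr.
trace_file="$(mktemp)"
exec 5> "$trace_file"
BASH_XTRACEFD=5
PS4='$LINENO: '

set -x
today="$(date +%Y%m%d)"   # the inner date call runs one nesting level deeper
set +x

# Show the trace; remove the file with rm -f "$trace_file" when done.
cat "$trace_file"
```

In the output, the date call should appear with the first digit of its line number doubled, while the assignment to today on the same line shows the plain line number.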

It seems like something goes wrong here:   

Code: [Select]
# Decode the value of the Content-Name header
  content_name="$(reformime -h "$content_name")"

I don't understand this. Can you help me?

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #7 on: December 06, 2019, 09:37:58 PM »
Solved it by adding the -c option to reformime, resulting in:
Code: [Select]
content_name="$(reformime -c UTF-8 -h "$content_name")"
Source: sourceforge.net/p/courier/mailman/message/24972857
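
One more thing the trace shows: reformime -h was also called with an empty string for the sections without a name, and the result was never empty, so the [ -z "$content_name" ] fallback could not trigger. A minimal sketch of a guard around the decode step, using the same -c UTF-8 call; decode_name is a hypothetical helper, not part of reformime, and how reformime -c UTF-8 -h behaves on empty input on your version is untested here:

```bash
#!/usr/bin/env bash
# Hypothetical helper: decode a MIME-encoded header value,
# but skip reformime entirely when there is nothing to decode.
decode_name() {
  local raw="$1"
  if [ -z "$raw" ]; then
    return 0                      # print nothing, so -z tests still work
  fi
  reformime -c UTF-8 -h "$raw"    # same call as in the fix
}

# Usage inside the loop (content_name as in the original script):
# content_name="$(decode_name "$content_name")"
empty_result="$(decode_name "")"
echo "decoded empty name: '${empty_result}'"
```

With a guard like this, an attachment without a name stays empty after decoding, so the random unnamed_XXXXXXXX.pdf fallback in the script can still be reached.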

Offline TerryF

  • grumpy old man
Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #8 on: December 06, 2019, 11:49:43 PM »
Winner :-)
--
qui scribit bis legit

Offline ReetP

Re: Problem Reformime and ANSI_X3.4-1968?
« Reply #9 on: December 07, 2019, 02:25:43 AM »
Nice find. Well done!!