Koozali.org formerly Contribs.org

Contribs.org Forums => SME Server 9.x => Topic started by: joost on October 31, 2019, 12:10:42 PM

Title: Problem Reformime and ANSI_X3.4-1968?
Post by: joost on October 31, 2019, 12:10:42 PM
I'm trying to automatically save an email attachment to a folder using qmail and reformime. I use the following script:

Code: [Select]
#!/usr/bin/env bash

# This script processes a MIME mail message read from stdin,
# extracts all PDF attachments,
# and returns the MIME message on stdout for further processing

# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C

# Setting the destination path for saved attachments
attachements='/home/e-smith/files/users/user/attachements/'

trap 'rm -f -- "$mailmessage"' EXIT # Purge temporary mail message

# Create a temporary message file
mailmessage="$(mktemp)"

# Save stdin message to tempfile
cat > "$mailmessage"

# Iterate all MIME sections from the message
while read -r mime_section; do

  # Get all section info headers
  section_info="$(reformime -s "$mime_section" -i <"$mailmessage")"

  # Parse the Content-Type header
  content_type="$(grep 'content-type' <<<"$section_info" | cut -d ' ' -f 2-)"

  # Parse the Content-Name header (if available)
  content_name="$(grep 'content-name' <<<"$section_info" | cut -d ' ' -f 2-)"

  # Decode the value of the Content-Name header
  content_name="$(reformime -h "$content_name")"

  if [[ $content_type = "application/pdf" || $content_name =~ .*\.[pP][dD][fF] ]]; then
    # Attachment is a PDF
    if [ -z "$content_name" ]; then
      # The attachment has no name, so create a random name
      content_name="$(mktemp --dry-run unnamed_XXXXXXXX.pdf)"
    fi
    # Prepend the date to the attachment filename
    filename="$(date +%Y%m%d)_$content_name"

    # Save the attachment to a file
    reformime -s "$mime_section" -e <"$mailmessage" >"$attachements/$filename"
  fi

done < <(reformime < "$mailmessage") # reformime lists all MIME sections

cat <"$mailmessage" # Re-inject the message to stdout for further processing
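
One caveat about the script above, separate from the charset problem: the Content-Name value comes from the incoming message, so it may contain slashes, spaces or other awkward characters before it is joined to the destination path. A minimal sketch of a helper that reduces it to a safe base name; `safe_filename` is a hypothetical name, not something reformime or the original script provides:

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the original script): reduce an
# attachment name taken from a mail header to a safe base name.
safe_filename() {
  local name="$1"
  name="${name##*/}"                   # drop any directory components
  name="${name//[^A-Za-z0-9._-]/_}"    # replace unusual characters with '_'
  # Strip leading dots so the result is never '.', '..' or a hidden file
  while [[ $name == .* ]]; do
    name="${name#.}"
  done
  # Fall back to a fixed name if nothing usable is left
  [[ -n $name ]] || name="unnamed.pdf"
  printf '%s\n' "$name"
}

# Possible use, mirroring the script's naming scheme:
# filename="$(date +%Y%m%d)_$(safe_filename "$content_name")"
```

With this in place a name such as `../../etc/passwd` is saved as `passwd` inside the attachments folder rather than being interpreted as a path.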

Problem: the script saves my file name.pdf as "attachements/20191025_[unknown character set: ANSI_X3.4-1968].pdf". So somehow reformime does not recognize the content of the Content-Name: header. Could this be due to a bad maildrop version? How can I solve this problem?
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: ReetP on October 31, 2019, 01:06:07 PM
I have no idea specifically, but a generic search for the error "unknown character set: ANSI_X3.4-1968" brings up some results you might want to read through.

However, the comment in your code suggests you may hit exactly this problem, so on the basis of that and the search above I would check your locale settings first.

Quote
# Ensure all locale settings are set to C, to prevent
# reformime from failing MIME headers decode with
# [unknown character set: ANSI_X3.4-1968]
# See: https://bugs.gentoo.org/304093
export LC_ALL=C LANG=C LANGUAGE=C

Check this for starters.

Code: [Select]
locale -v
I would also imagine that without full access to the original mail we would struggle to replicate this for testing.
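
For background, ANSI_X3.4-1968 is the name glibc reports for plain ASCII, which is the codeset of the C locale. A quick way to see which character-set name each locale reports, as a sketch that assumes a glibc system with the `locale` utility installed (the helper name `charmap_of` is made up for illustration):

```bash
#!/usr/bin/env bash

# Print the codeset name reported for a given locale, or "unknown" if
# the locale utility is missing or fails.
charmap_of() {
  LC_ALL="$1" locale charmap 2>/dev/null || echo "unknown"
}

echo "current locale: $(locale charmap 2>/dev/null || echo unknown)"
echo "C locale:       $(charmap_of C)"   # typically ANSI_X3.4-1968 on glibc
```

If the second line prints ANSI_X3.4-1968, that would explain where the name in the error message comes from once a script forces LC_ALL=C.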
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: joost on November 01, 2019, 02:00:13 PM
Thanks. Output of
Code: [Select]
locale -v
Code: [Select]
LANG=nl_NL.UTF-8
LC_CTYPE="nl_NL.UTF-8"
LC_NUMERIC="nl_NL.UTF-8"
LC_TIME="nl_NL.UTF-8"
LC_COLLATE="nl_NL.UTF-8"
LC_MONETARY="nl_NL.UTF-8"
LC_MESSAGES="nl_NL.UTF-8"
LC_PAPER="nl_NL.UTF-8"
LC_NAME="nl_NL.UTF-8"
LC_ADDRESS="nl_NL.UTF-8"
LC_TELEPHONE="nl_NL.UTF-8"
LC_MEASUREMENT="nl_NL.UTF-8"
LC_IDENTIFICATION="nl_NL.UTF-8"
LC_ALL=

I added
Code: [Select]
export LC_ALL=C LANG=C LANGUAGE=C
to my script because the script didn't work. Somebody suggested that I have a bad maildrop version; see the related mail-filter maildrop bug: bugs.gentoo.org/304093. My maildrop version seems to be 2.5.0. How can I upgrade this package, perhaps to maildrop 2.9.3?
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: ReetP on November 01, 2019, 02:36:58 PM
With SME v9.x you should be on 2.5.x

rpm -qa |grep maildrop

maildrop-2.5.0-13.el6.x86_64
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: joost on November 03, 2019, 09:34:32 PM
Yes, indeed. The package I have is
Quote
maildrop-2.5.0-13.el6.x86_64
Somebody suggested that the script works, but they use maildrop 2.9.3. Is there a way to upgrade this?
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: ReetP on November 03, 2019, 10:07:11 PM
The link you posted to the bug said this issue was fixed in 2.4.1 so your version should be ok.

To get a newer version you'd have to build your own package.

And even then it may not actually solve your issue.

There are other comments out there, e.g.

https://confluence.atlassian.com/confkb/filesystem-encoding-is-written-as-ansi_x3-4-1968-even-though-the-server-is-set-to-utf-8-658735809.html

You might need to try a bit more investigation.

You might also try to debug what is going on in your script.
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: joost on November 30, 2019, 11:41:15 PM
If I try to debug the script by adding:

Code: [Select]
# DEBUG
exec 5> debug.txt
BASH_XTRACEFD="5"
PS4='$LINENO: '
set -x

to the script.

The output is:

Code: [Select]
17: export LC_ALL=C LANG=C LANGUAGE=C
17: LC_ALL=C
17: LANG=C
17: LANGUAGE=C
20: attachements=/home/e-smith/files/users/pdf/home/
22: trap 'rm -f -- "$mailmessage"' EXIT
225: mktemp
25: mailmessage=/tmp/tmp.yFIFzj3M1e
28: cat
31: read -r mime_section
558: reformime
334: reformime -s 1 -i
34: section_info='section: 1
content-type: multipart/mixed
content-transfer-encoding: 8bit
charset: ISO-8859-1
content-language: nl-NL
starting-pos: 0
starting-pos-body: 966
ending-pos: 11065
line-count: 170
body-line-count: 150'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=multipart/mixed
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ multipart/mixed = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.1 -i
34: section_info='section: 1.1
content-type: text/plain
content-transfer-encoding: 7bit
charset: utf-8
starting-pos: 1050
starting-pos-body: 1138
ending-pos: 1146
line-count: 6
body-line-count: 3'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=text/plain
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=
443: reformime -h ''
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ text/plain = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
45: [[ [unknown character set: ANSI_X3.4-1968] =~ .*\.[pP][dD][fF] ]]
31: read -r mime_section
334: reformime -s 1.2 -i
34: section_info='section: 1.2
content-type: application/pdf
content-name: test.pdf
content-transfer-encoding: base64
charset: ISO-8859-1
content-disposition: attachment
content-disposition-filename: test.pdf
starting-pos: 1186
starting-pos-body: 1323
ending-pos: 11023
line-count: 138
body-line-count: 132'
337: cut -d ' ' -f 2-
337: grep content-type
37: content_type=application/pdf
440: cut -d ' ' -f 2-
440: grep content-name
40: content_name=test.pdf
443: reformime -h test.pdf
43: content_name='[unknown character set: ANSI_X3.4-1968]'
45: [[ application/pdf = \a\p\p\l\i\c\a\t\i\o\n\/\p\d\f ]]
47: '[' -z '[unknown character set: ANSI_X3.4-1968]' ']'
552: date +%Y%m%d
52: filename='20191130_[unknown character set: ANSI_X3.4-1968]'
55: reformime -s 1.2 -e
31: read -r mime_section
60: cat
1: rm -f -- /tmp/tmp.yFIFzj3M1e
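
A note on reading the trace above: bash repeats the first character of the expanded PS4 once per nesting level. With PS4='$LINENO: ' that first character is a digit, so a command on line 25 that runs inside a command substitution is logged as 225, line 34 as 334, and so on. A minimal sketch, assuming bash 4.1 or later (for BASH_XTRACEFD), that starts PS4 with '+' so the depth appears as extra plus signs instead; the temp file and the echo stand-in are illustrative assumptions:

```bash
#!/usr/bin/env bash

# Sketch: same tracing setup, but PS4 starts with '+', so the nesting
# depth shows up as extra '+' signs instead of repeated digits.
tracefile="$(mktemp)"             # assumption: a temp file instead of debug.txt
exec 5> "$tracefile"              # open fd 5 for the trace output
BASH_XTRACEFD=5                   # send xtrace output to fd 5
PS4='+$LINENO: '
set -x
content_name="$(echo test.pdf)"   # stand-in for a reformime call
set +x
exec 5>&-                         # close the trace file descriptor

trace="$(cat "$tracefile")"
rm -f -- "$tracefile"
printf '%s\n' "$trace"
```

In the printed trace, the `echo` inside the command substitution is prefixed with `++` and the assignment with a single `+`, each followed by the plain line number.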

It seems like something goes wrong here:   

Code: [Select]
# Decode the value of the Content-Name header
  content_name="$(reformime -h "$content_name")"

I don't understand this. Can you help me?
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: joost on December 06, 2019, 09:37:58 PM
Solved it by adding the -c option to reformime, resulting in:
Code: [Select]
content_name="$(reformime -c UTF-8 -h "$content_name")"
Source: sourceforge.net/p/courier/mailman/message/24972857
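
Put together, the decode step with the -c option can be wrapped in a small helper. This is only a sketch: `decode_content_name` is a hypothetical name, and the fallbacks to the raw value (when reformime is not installed, returns nothing, or still returns the "[unknown character set: ...]" marker) are assumptions added for illustration, not behaviour described in this thread. Returning early for an empty name also keeps the script's later "attachment has no name" check meaningful.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the fix: decode a Content-Name value with
# an explicit character set and never return the error marker as a name.
decode_content_name() {
  local raw="$1" decoded=""

  # Nothing to decode: keep the name empty so callers can detect it
  [[ -n $raw ]] || return 0

  if command -v reformime >/dev/null 2>&1; then
    # -c names the character set explicitly instead of relying on the locale
    decoded="$(reformime -c UTF-8 -h "$raw")"
  fi

  # Assumption: fall back to the undecoded value if decoding produced
  # nothing or the "[unknown character set: ...]" marker
  case "$decoded" in
    ''|'[unknown character set:'*) decoded="$raw" ;;
  esac

  printf '%s\n' "$decoded"
}

# Possible use in the script:
# content_name="$(decode_content_name "$content_name")"
```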
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: TerryF on December 06, 2019, 11:49:43 PM
Winner :-)
Title: Re: Problem Reformime and ANSI_X3.4-1968?
Post by: ReetP on December 07, 2019, 02:25:43 AM
Nice find. Well done!!