Koozali.org: home of the SME Server

a script to visual analze emailheaders using country geoip

Offline purvis

a script to visual analze emailheaders using country geoip
« on: April 12, 2013, 09:34:33 AM »
This program was written to help me analyze how I might want to handle bad email by determining its country of origin with the GeoIP lookup software, which is easily installed by following a wiki page from contribs.org.

This program helps me see where email might be coming from. It will not be perfect, for a variety of reasons, but it is a tool just the same.
This program can also be used to test whether email is coming into the server from an unwanted country.
I am sure it can be used for other purposes too.
There is also a function that I added to see where emails marked with a certain spam level may be coming from. The level is shown by asterisks placed in the email headers, presumably by SpamAssassin.
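
As a rough illustration of the asterisk threshold described above, here is a minimal sketch. It assumes the header line looks like "X-Spam-Level: *****"; the function name spam_level_at_least is made up for this example and is not part of the script below.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when an X-Spam-Level header line carries
# at least $2 asterisks. Illustrative only, not taken from the script below.
spam_level_at_least() {
    local header_line=$1 minimum=$2
    local stars=${header_line#*:}      # drop the "X-Spam-Level:" field name
    stars=${stars//[!*]/}              # keep only the asterisks
    [ "${#stars}" -ge "$minimum" ]
}

if spam_level_at_least "X-Spam-Level: *****" 3; then
    echo "flagged"
fi
```

Counting the asterisks avoids building an escaped pattern string, at the cost of one more helper function.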

In the future, as a general rule, I will not allow any email from countries outside the USA to accounts on our servers. But I will allow certain mail to the maillog account.
This program is more than I needed, but I tried to make it useful for other users. My learning curve with bash and regular expressions has been slow.
As written, this script does not delete emails.

From what I have seen of some other add-on programs for SME Server, you have to exclude countries. That could be fine for somebody who wants that. But I need to filter in, not out: accept email only from the countries I want.
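
The "filter in" idea can be sketched as a small allow-list test. This is only an illustration: the function name country_allowed and the sample list are assumptions, and the comma-terminated format simply mirrors the one used later in this thread.

```shell
#!/bin/bash
# Hypothetical allow-list check. Every code in the list ends with a comma.
allowed_countries="US,UM,"

country_allowed() {
    # $1 is a two-letter country code such as "US"
    case ",$allowed_countries" in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

for code in US GB; do
    if country_allowed "$code"; then
        echo "$code accepted"
    else
        echo "$code rejected"
    fi
done
```

Anything not named in the list is rejected by default, which is the opposite of an exclude list.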

This script makes use of the GeoIP program found here: http://wiki.contribs.org/GeoIP
You can also find most of the country codes listed there.
The program has to be installed on your server for this script to run as is.
The script was written with as few functions as possible to keep it fast, while trying to keep it somewhat modular so it is easy to change.
There are some places where I commented out lines while testing. I have limited time for this kind of work and it earns me zero dollars, apart from the trouble the script may save me in downtime in the future.

In about a month I will share another script derived from this one. It will be much leaner, so that it can process received emails full time as they arrive.

This script is set up to show the headers of emails received from outside the United States. Any outside IP address in the email header that is not from one of the wanted countries flags the email in the script as rejected.
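
To show the first step of that check in isolation, here is a minimal sketch that pulls the IPv4 addresses out of Received: lines. The sample header and the name received_ips are invented for the example, and the pattern is looser than the one in the script (it does not limit each octet to 0-255). It also assumes wrapped header lines have already been joined.

```shell
#!/bin/bash
# Illustrative only: list the unique IPv4 addresses found in Received: lines.
received_ips() {
    grep -i '^Received:' |
        grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}' |
        sort -u
}

# Made-up header text using documentation address ranges.
sample='Received: from mail.example.com (HELO mail.example.com) (203.0.113.7)
  by mx.example.org with SMTP; 12 Apr 2013 09:34:33 -0000
Subject: test from 198.51.100.1
Received: from unknown (192.0.2.55) by mail.example.com'

printf '%s\n' "$sample" | received_ips
```

The address in the Subject: line is ignored because only lines starting with Received: are searched.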

I had a thought about placing the countries into the header of every received email. That might prove useful somewhere down the road. But I am learning to crawl before I run.
Code: [Select]
#!/bin/bash


### This program was written to help evaluate received emails to determine
###   if emails need to be blocked by ip addresses in the email headers or
###   to see if emails are being received from countries not wanted.
### There are other uses one might want to use this program for.
### This program makes use of a lot of functions to make it somewhat easier
###    to alter to get other results.
### This program does not alter emails as written, and the user takes full
###    responsibility for its actions.
 

### This script will search for emails where an ip address does not
###  come from a safe list placed into a string.

### There is another option to search only on the spam asterisk marks
### in the "X-Spam-Level:" email header.
### Those are compared to how many asterisks are in the field.
### You set the minimum number of asterisks to have in the field.
### That number or a higher number of asterisks will display the email.
### spamastericks must be set to 0 (zero) for the script to work on ip addresses alone.


###---------------------------------------------------------------------start of variables
 
###  All countries need a space before and after the country and
###     a comma immediately after the country abbreviation.
###  Between each country needs to be a backslash and a bar together
 
     safelistofcountries=" US, \| UM, \| GOV, \| MIL, "

### edit the wanipaddressofthisemailserver
### place your ip address in so that it will not be searched
### this is not necessary but may prove helpful.
### the program will not report on the ip addresses in this string
### place a minus sign in front of each IPv4 address with no spaces in the string
###  an example is   "-78.152.2.1-645.1.284.291"

     wanipaddressofthisemailserver=""

### Set nosafecountryduplicate to 1 to not record a second ip lookup of an address
###     that was already found in a previous email that was geoip looked up if that ip
###     address fell into one of the safe countries list.
### All addresses outside of the safe list will be shown and recorded in the log files
### The setting of 0(zero) will record all ip addresses. This feature was designed to
###     speed up the program when set to 1, but you could make good use of the log
###     when you record all ip addresses.
###     Any ip address will be listed only once for each email header scanned.
       
       nosafecountryduplicate=1

### set readmaillogemails to 0 to exclude the emails to the  maillog account
### set readmaillogemails to 1 to exclude the emails that are not in the maillog account
### set readmaillogemails to 2 to read all emails, both maillog and non maillog accounts
### if you are reading a directory full of email files not under the Maildir directory set this value to 2
#readmaillogemails=0    # exclude emails under maillog account
#readmaillogemails=1    # read only the emails under the maillog account
#readmaillogemails=2    # read all emails

   readmaillogemails=2    # read all emails

### set spamastericks to the number of spam asterisks that need to be found in the spam header line

   spamastericks=0

### assign the program to lookup geo ip address based on countries
    geoiplookupcommand="/usr/bin/geoiplookup"
###---------------------------------------------------------------------end of variables


###---------------------------------------------------------------- start of functions

function whereisipfrom {
   local stemp=""
   iplocationaccept=0
   local tempcount=0
   
   if [ $nosafecountryduplicate -eq 1 ];then
       tempcount=$(echo $sipacceptlist | grep  -c  "\-$1\-")
       if [ $tempcount -gt 0 ]
          then
          iplocationaccept=1
          return
       fi
   fi

   stemp=$($geoiplookupcommand "$1" | sed  -e 's/.*://')
 # stemp=$(echo $stemp | tr '[:upper:]' '[:lower:]')
   sgroupedcountries="$sgroupedcountries$1 $stemp\n"
   echo -e "$1\t$stemp" >> /tmp/geoemailall
   tempcount=$(echo $stemp | grep  -i -c "$safelistofcountries")
   if [ $tempcount -gt 0 ]
       then
       iplocationaccept=1
       sipacceptlist="$sipacceptlist$1-"
       echo -e "$1\t$stemp" >> /tmp/geoemailaccepted
       return
   fi
   if [ $iplocationrejected -eq 0 ];then clear;fi
   
   tempcount=$(echo $stemp | grep  -i -c " not found")
   if [ $tempcount -gt 0 ];then iplocationaccept=-1;fi

   echo "$1 $stemp"

   echo -e "$1\t$stemp" >> /tmp/geoemailrejected
   if [ "$iplocationaccept" -eq 0 ] || [ "$iplocationaccept" -eq -1 ]
      then
      iplocationrejected=2
      ipthisreceivedlinerejected=1
   fi
return
}

function checkipaddresses  {
   local tempcount=0
   local stemp=""
   local kk=""
   ipthisreceivedlinerejected=0
   set $1
   for item
    do
        tempcount=$(echo "-$item" | grep -c  "\-127.0.0.1\|\-192.168.\|\-10.\|\-169.254.")
        if [ "$tempcount" -gt 0 ];then continue;fi
        tempcount=$(echo "-$item" | grep -c  "\-172.")
         if [ "$tempcount" -gt 0 ];then
            tempcount=$(echo "-$item" | grep -c  "\-172.16.\|\-172.17.\|\-172.18.\|\-172.19.\|\-172.20.\|\-172.21.\|\-172.22.\|\-172.23.\|\-172.24.\|\-172.25.\|\-172.26.\|\-172.27.\|\-172.28.\|\-172.29.\|\-172.30.\|\-172.31.")
            if [ "$tempcount" -gt 0 ];then continue;fi
         fi
        tempcount=$(echo $ipaddresseschecked | grep  -c "\-$item\-")
        if [ "$tempcount" -gt 0 ];then continue;fi
        ipaddresseschecked="$ipaddresseschecked$item-"
        whereisipfrom "$item"
    done
return
}

function locateipaddresses {
   local stemp=""
   local ipaddresses=""
   iplocationrejected=0
   ipaddresseschecked="-127.0.0.1-$wanipaddressofthisemailserver-"
   for item in $semailheader
   do
         stemp=$(echo "$item" | grep  -i  "^Received:" )
         if  [ ! -z "$stemp" ]
             then
             ipaddresses=$(echo "$stemp" | grep -E -o '((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])')
             #echo $ipaddresses
             if  [ ! -z "$ipaddresses" ]
                 then
                 checkipaddresses "$ipaddresses"
                 #the next line is for testing, so you can view the received line in the email header where the rejected ip address is found
                 if [ $ipthisreceivedlinerejected -gt 0 ];then echo "$stemp";echo " ";fi
             fi
         fi
      done
return
}

function isemailspamflagged {
   local tempcount=0
   local stemp=""
   flaggedwithspam=0
   for item in $semailheader
   do
       stemp=$(echo "$item" | grep  -i  "^X-Spam-Level:" )
       if  [ ! -z "$stemp" ];then
           tempcount=$(echo "$stemp" | grep  -c "$sspamstring" )
           if  [ $tempcount -gt 0 ];then  flaggedwithspam=1;fi
         fi
   done
return
}

function echoemailheaderline {
   for item in $semailheader
   do
    echo "$item" | grep  -i  "$1"
   done
return
}
function getemailheaderfromemailfile {
semailheader=$(grep -B10000 -m1 ^$ $filename | tr -s '\t' '\040'|sed -e ':a;N;$!ba;s/\n / /g')
return
}
###----------------------------------------------------------------------end of functions






###   -----------------------------------------------------------------------------------------------
###   main program starts here
###



 

if [ ! -f "$geoiplookupcommand" ]
   then
   echo "the program below does not exist and is needed"
   echo "$geoiplookupcommand"
   exit 1   # nothing useful can be done without geoiplookup
fi

rm -f /tmp/geoemailall
rm -f /tmp/geoemailaccepted
rm -f /tmp/geoemailrejected
echo "all email geoipaddress " > /tmp/geoemailall
echo "" >> /tmp/geoemailall

echo "all email geoipaddress accepted" > /tmp/geoemailaccepted
echo "" >> /tmp/geoemailaccepted

echo "all email geoipaddress rejected " > /tmp/geoemailrejected
echo "" >> /tmp/geoemailrejected

   counter=0
   while [ $counter -lt $spamastericks ]
         do sspamstring="$sspamstring\*"
         let counter=counter+1
   done

clear
echo "working ! reading emails ......................"
echo ""
echo ""

countfile=0
sipacceptlist="-"
IFS=$(echo -en "\n\b")

for filename in $(find /home/e-smith/files/users/*/Maildir/ -name "1*.*.$HOSTNAME*" -type f )
do

   if [ "$readmaillogemails" -eq 0 ];then
      if [[ $filename == */maillog/Maildir/* ]];then  continue; fi
   fi

   if [ "$readmaillogemails" -eq 1 ];then
      if [[ $filename != */maillog/Maildir/* ]];then  continue; fi
   fi
   let countfile+=1
   #echo $countfile
   #echo $filename
   getemailheaderfromemailfile "$filename"
   #echo -e "$semailheader"
   sgroupedcountries=""
   iplocationrejected=0
   locateipaddresses

   flaggedwithspam=0
   if [ $spamastericks -gt 0 ]
      then
      iplocationrejected=0
      isemailspamflagged
   fi

  showemail=0
  if [ $flaggedwithspam -gt 0 ];then showemail=1; fi
  if [ "$iplocationrejected" -gt 0 ] && [ "$spamastericks" -eq 0 ];then showemail=1;fi

  if [ $showemail -gt 0 ]
      then
      echoemailheaderline ":"
      echo ""
      echo "-------------------------------------------"
      echo -e "$sgroupedcountries"
      echoemailheaderline "^Date: "
      echoemailheaderline "^TO: "
      echoemailheaderline "^From: "
      echoemailheaderline "^Return"
      echoemailheaderline "^Subject: "
      echoemailheaderline "^X-"
      echoemailheaderline "^RECEIVE"
      echo "filename $filename"
      echo "--------------------------------------------------------------------------------------------------"
      read -p "Press [Enter] key to continue ......."
      echo "working ! reading emails ......................"
      echo  ""
      echo ""
   fi
done


echo ""
echo ""
echo "total emails read $countfile"
echo ""
echo "next! view 3 files with only geoiplookup returned information"
echo "view geoip countries in the file /tmp/geoemailall"
echo "view geoip countries in the file /tmp/geoemailaccepted"
echo "view geoip countries in the file /tmp/geoemailrejected"
echo ""
echo "  To quit viewing each list, press the q key to quit"
echo "  You will have to delete the 3 files above manually"
echo "   when you are done using this program."
echo "   They are left undeleted so you can use them in other ways."
read -p "Press [Enter] key to continue ......."
cat /tmp/geoemailall | less
cat /tmp/geoemailaccepted | less
cat /tmp/geoemailrejected | less
exit 0
« Last Edit: April 18, 2013, 09:12:49 AM by purvis »

Offline Stefano

Re: a script to visual analze emailheaders using country geoip
« Reply #1 on: April 12, 2013, 10:40:09 AM »
be aware that a USA company could have its servers in mongolia..

blocking email with GeoIP could be dangerous

Offline CharlieBrady

Re: a script to visual analze emailheaders using country geoip
« Reply #2 on: April 12, 2013, 05:33:13 PM »
be aware that a USA company could have its servers in mongolia..

blocking email with GeoIP could be dangerous

Moreover, you won't solve the spam problem by allowing email only from the USA. 20-25% of global SPAM originates from USA IP addresses. Only China produces more.

http://www.securelist.com/en/analysis/204792282/Spam_in_January_2013
http://www.securelist.com/en/analysis/204792251/Spam_in_Q3_2012

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #3 on: April 12, 2013, 10:27:53 PM »
Stefano, you may have to whitelist some IP addresses.
I suggest using an email server that is more likely to scan emails better for those outside companies.
I don't think having personal mail on a business server is appropriate.
Email belongs to the server owner. Let's not start a discussion on that.
Charlie, I do acknowledge that spam can come from the USA. If I can cut 75 to 80 percent out, that is OK with me.

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #4 on: April 12, 2013, 10:44:56 PM »
I am reviewing my email to get a view of what is coming in and how it is rated for spam.
I do not have enough email coming in to do a really top-notch job of review.
But I think that I can make some remarks.

It appears so far that most of the emails containing malicious material are marked with a non-USA IP address in some way. Most often it is the originator's IP address.

I have seen a lot of unwanted email, either malicious or spam, with an email header showing that
PHPMailer from sourceforge.net was used.

My brother's yahoo.com account was somehow compromised and was sending spam to his contact list, and some if not all of his Yahoo contacts were in the email headers. The originator of the bad email was outside the USA, and each of the four times this kind of email came, the originator was in a different country outside the USA.
Only the last email was marked as spam. This started over a year ago, and every email consisted of a single malicious link.

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #5 on: April 12, 2013, 10:54:03 PM »
I would think spammers would want to send from a country other than the one receiving the spam.
Right now I have been cheated out of over 500 dollars by a source in England.
Do you think the police or I are going to track that down?
It would cost too much.
Do countries make laws to safeguard other countries, or even care much?
I do not think so.

These are the reasons I don't trust emails coming from outside one's own country.
I am sure that, as global as the world's economy has gotten, there is a need. But not all is global when it comes to business.

Offline Stefano

Re: a script to visual analze emailheaders using country geoip
« Reply #6 on: April 12, 2013, 10:59:35 PM »
purvis, let me say you have a strange idea of spammers :-)

spammers use many ways to send spam..

feel free to do what you prefer/want, but using GeoIp to fight spam at smtp level is not a smart idea.. you should instead use it with spamassassin..

my 2€c

Offline hawkinstw

Re: a script to visual analze emailheaders using country geoip
« Reply #7 on: April 13, 2013, 04:41:02 AM »
According to av-test.org, the USA is currently actually the biggest source of SPAM at 14.9%, followed by Argentina at 6.4%. China is way down at number 6 with 4%.

http://www.av-test.org/en/statistics/spam/

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #8 on: April 18, 2013, 09:10:58 AM »
The above script needs a slight change.
In the script, I did not want to do a GeoIP lookup of a computer behind a NAT device.
I had forgotten about self-assigned IP address ranges. I did see a received email that originated on a computer with
a self-assigned IP address and came from a .edu domain.
This altered code should cause a self-assigned IP address NOT to be looked up.

My intention is not to look up any private network IP addresses.
http://en.wikipedia.org/wiki/Private_network
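
For comparison, the same private-address test can be sketched without grep by using shell patterns. This is only an illustration: is_private_ipv4 is a made-up name, and unlike the script it treats the whole 127.x.x.x block as loopback rather than just 127.0.0.1.

```shell
#!/bin/bash
# Hypothetical helper: succeeds for loopback, RFC 1918 and link-local IPv4.
is_private_ipv4() {
    case $1 in
        127.*|10.*|192.168.*|169.254.*)        return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *)                                     return 1 ;;
    esac
}

for ip in 10.1.2.3 172.20.0.9 172.32.0.1 8.8.8.8; do
    if is_private_ipv4 "$ip"; then
        echo "$ip private, skip lookup"
    else
        echo "$ip public, look up"
    fi
done
```

A case pattern is matched against the whole address, so 110.1.1.1 does not match 10.* and no leading dash trick is needed.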

remove this code in the above script
Code: [Select]
   tempcount=$(echo "-$item" | grep -c  "\-127.0.0.1\|\-192.168.\|\-10.\|\-169.254.")
        if [ "$tempcount" -gt 0 ];then continue;fi
        tempcount=$(echo "-$item" | grep -c  "\-172.")
         if [ "$tempcount" -gt 0 ];then
            tempcount=$(echo "-$item" | grep -c  "\-172.16.\|\-172.17.\|\-172.18.\|\-172.19.\|\-172.20.\|\-172.21.\|\-172.22.\|\-172.23.\|\-172.24.\|\-172.25.\|\-172.26.\|\-172.27.\|\-172.28.\|\-172.29.\|\-172.30.\|\-172.31.")
            if [ "$tempcount" -gt 0 ];then continue;fi
         fi

and replace it with this code
Code: [Select]
       tempcount=$(echo "-$item" | LC_ALL=C grep -c  "\-127\.0\.0\.1\|\-192\.168\.\|\-10\.\|\-169\.254\.")
       if [ $tempcount -gt 0 ];then continue;fi
       tempcount=$(echo "-$item" |LC_ALL=C grep -c  "\-172\.1[6-9]\.\|\-172\.2[0-9]\.\|\-172\.3[0-1]\.")
       if [ $tempcount -gt 0 ];then continue;fi
« Last Edit: April 19, 2013, 07:32:22 PM by purvis »

Offline CharlieBrady

Re: a script to visual analze emailheaders using country geoip
« Reply #9 on: April 18, 2013, 03:22:43 PM »
You have left 172.16.a.b through 172.31.x.y out of your list.  See RFC 1918.

Offline Knuddi

Re: a script to visual analze emailheaders using country geoip
« Reply #10 on: April 18, 2013, 07:49:10 PM »
Just an interesting image of the countries that produce spam based on the volume I see at ScanMailX:

https://www.scanmailx.com/index.php?option=com_content&view=article&id=6&Itemid=34&lang=en

This is updated hourly and yes, USA is often scoring quite high :-)

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #11 on: April 19, 2013, 12:33:03 AM »
Charlie,
there are ALREADY lines of code to catch the ranges you mentioned.
Here are all the lines that help to remove private IP addresses.

Code: [Select]
tempcount=$(echo "-$item" | LC_ALL=C grep -c  "\-127\.0\.0\.1\|\-192\.168\.\|\-10\.\|\-169\.254\.")
if [ $tempcount -gt 0 ];then continue;fi
tempcount=$(echo "-$item" |LC_ALL=C grep -c  "\-172\.1[6-9]\.\|\-172\.2[0-9]\.\|\-172\.3[0-1]\.")
if [ $tempcount -gt 0 ];then continue;fi

There are other IP address ranges listed here that I am not sure about, which may need culling out as well.
http://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_address_blocks

Originally my code compared the numeric values of IP addresses. That meant the IP addresses had to be converted into numeric values, which also required somewhat more complicated formulas and comparisons. I did not do a timing test to see which was faster, grep matching or value comparison. But to me the grep version looked easier to read and edit from a programming point of view.
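
To make the numeric approach concrete, here is a minimal sketch of converting a dotted address to a number and testing it against a range. The names ip_to_int and in_range are made up for this example, and it assumes octets are written without leading zeros.

```shell
#!/bin/bash
# Illustrative sketch: compare IPv4 addresses as 32-bit numbers.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

in_range() {
    # $1 = address, $2 = first address of the range, $3 = last address
    local n
    n=$(ip_to_int "$1")
    [ "$n" -ge "$(ip_to_int "$2")" ] && [ "$n" -le "$(ip_to_int "$3")" ]
}

ip_to_int 192.168.0.0      # prints 3232235520
if in_range 172.20.5.1 172.16.0.0 172.31.255.255; then
    echo "private"
fi
```

Whether this is faster than grep would need a timing test; each call to ip_to_int inside $( ) starts a subshell.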

I will do some further testing of the other IP address ranges listed on the above web page, and if I find anything worthwhile I will not hesitate to post it quickly.
« Last Edit: April 19, 2013, 07:27:44 PM by purvis »

Offline Stefano

Re: a script to visual analze emailheaders using country geoip
« Reply #12 on: April 19, 2013, 03:56:05 PM »
purvis, thank you for your effort, but I'd suggest you use perl.. it has some modules that can help you and, most important, you don't have to reinvent the wheel

google -> "perl mail header parser" ;-)

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #13 on: April 19, 2013, 08:25:24 PM »
The above script was designed to do the job, not to be efficient: it just gets the job done of recognizing unwanted countries.
There are only going to be two more scripts. Both are written to trim the fat, reduce the number of functions, and increase speed wherever I could find it.

I think I have identified two places where the code might be improved in the script as it is now.
One is creating the semailheader string, where the email header lines are rebuilt without line breaks (wraps). Every email header line is supposed to be somewhere around 72 characters long, which means almost all email headers include some header lines that were wrapped. If sed could be used as the only tool to read the email file and rearrange the header, in one sed command with multiple actions, that one line would increase the speed a lot.
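
One possible shape for such a single sed command is sketched below. It assumes GNU sed, the function name unfold_headers is made up, and it has not been timed against the grep, tr and sed pipeline used in the script, so treat it as a starting point rather than a proven speed-up.

```shell
#!/bin/bash
# Sketch only, assuming GNU sed: print the header of an email file with
# wrapped lines joined, using one sed process.
unfold_headers() {
    sed -n '
        # a blank line ends the header: stop reading the file
        /^$/q
        :a
        # append the next line, unless this is the last line of the file
        $!N
        # a line that starts with a space or tab continues the previous one
        s/\n[[:blank:]][[:blank:]]*/ /
        ta
        # the appended line was the blank separator: print the field and stop
        /\n$/{
            s/\n$//
            p
            q
        }
        # print the completed field, keep the line after it for the next pass
        P
        D
    ' "$1"
}

# Made-up sample message written to a temporary file.
sample=$(mktemp)
printf 'Received: from a.example\n\tby b.example\nSubject: hello\n\nbody text\n' > "$sample"
unfold_headers "$sample"
rm -f "$sample"
```

Because sed quits at the first blank line, the body of a large email is never read.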

The other is speeding up grep. I am making use of LC_ALL, for good or bad, to speed grep up some. How this affects other locales, I do not know. You can remove the "LC_ALL" where it appears if you like.


This script is more of a no-hand-holding script. It brings all the code above into one simple function, removes as much variable passing as I could manage with my experience, and modifies lines of code in ways that hopefully increase speed.
There is no perl here. I wanted something almost anyone can learn to use and modify with the basic Linux utilities found on almost all Linux machines.

On my older machine, with a single processor and a slow single drive, this routine processes emails that are short in length but have a substantial header at about 28.5 per second, or 1710 per minute. That was measured processing 10,000 identical email files from a directory that holds 100,000 of them.
But this script is not really the final desired result. It just provides a tool with which somebody can scan some emails and add to or take away from the code in a simple programming fashion. Almost anybody should be able to remove code and add code to get the results they want.

The next script I post, not this one, will hopefully be the final one: a script to check whether a single email file has an IP address from an unwanted country. We will see where that goes when it gets created today, but here is this one. I am all about speed, so if there is a noticeable speed-up to be had inside the function in the script, I am all ears. No perl in this code, please. The next script will actually have a nearly identical function, and that function is where any speed-up needs to happen, using basic Linux utilities. As you can tell, this bash script is much more compact than the one above.
Code: [Select]
#!/bin/bash

safelistofcountries=" US, \| UM, \| GOV, \| MIL, "
wanipaddressofthisemailserver=""

function lookforrejectedcountry {
   local iplocationrejected=0
   local tempcount=0
   local ipaddresses=""
   local ipchecked=""
   local semailheader=""
   local filename=""
   local item=""
    filename=$1
    ipchecked="-127.0.0.1-$wanipaddressofthisemailserver-"
    semailheader=$(sed -e '/^$/q' "$filename" | tr -s '\t' '\040'| sed -e ':a;N;$!ba;s/\n / /g')
    ipaddresses=$(echo "$semailheader" | LC_ALL=C grep "Received:" | LC_ALL=C grep -E -o '((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])')
    if [ -z "$ipaddresses" ];then return 0;fi
    set $ipaddresses
    for item
    do
       # match the whole address between dashes so 1.2.3.4 is not mistaken for 1.2.3.45
       tempcount=$(echo "$ipchecked" | LC_ALL=C grep -F -c -- "-$item-")
       if [ $tempcount -gt 0 ];then continue;fi
       ipchecked="$ipchecked$item-"
       tempcount=$(echo "-$item" | LC_ALL=C grep -c  "\-127\.0\.0\.1\|\-192\.168\.\|\-10\.\|\-169\.254\.")
       if [ $tempcount -gt 0 ];then continue;fi
       tempcount=$(echo "-$item" |LC_ALL=C grep -c  "\-172\.1[6-9]\.\|\-172\.2[0-9]\.\|\-172\.3[0-1]\.")
       if [ $tempcount -gt 0 ];then continue;fi
        tempcount=$(geoiplookup "$item" | sed  -e 's/.*://' | LC_ALL=C grep  -i -c "$safelistofcountries")
        if [ $tempcount -eq 0 ]
           then
           iplocationrejected=1
          stemp=$(geoiplookup "$item"); echo "rejected $item $stemp"
        fi
   done
return $iplocationrejected
}


###   -----------------------------------------------------------------------------------------------
###   main program starts here
###

/usr/bin/renice 19 -p $$ > /dev/null
/usr/bin/ionice -c3 -n7 -p $$ > /dev/null

echo "working ! reading emails ......................"
echo ""

countfile=0
IFS=$(echo -en "\n\b")
for filename in $(find /home/e-smith/files/users/*/Maildir/*/ -name "1*" -type f )
do
#  echo $filename
  lookforrejectedcountry "$filename"
  if [ $? = 1 ];then echo "$filename";echo "rejected because of country";echo "";fi
   let countfile+=1
done
echo ""
echo "total emails read $countfile"
echo ""
exit 0
« Last Edit: April 19, 2013, 10:23:16 PM by purvis »

Offline purvis

Re: a script to visual analze emailheaders using country geoip
« Reply #14 on: April 20, 2013, 06:05:32 PM »
This is pretty much my final work on finding IP addresses in emails that are not from accepted countries.
There was a lot of work and testing put into this baby to make it fly as a bash script.
I made it as flexible as possible for others, but it still needed efficient code.
Basically the bash script is nothing more than a function written for another program to call.

I made it so that you can override the default country list inside the bash script by passing an optional parameter.
You call this program from another program by passing it a filename and a list of country abbreviations, each in quotation marks.
Do not forget to enclose those parameters in quotation marks.
ex:  bashscript "/home/e-smith/files/users/john doe/new/1abcde.eml" "US,GB,IT,"
Place a comma after every country abbreviation, even the last one. Countries are matched together with the trailing comma.

This code runs much faster, and I am sure there is still room for some improvement.

There is a single "break" statement, and it is used to leave the loop, and so exit the bash script, at the first country not in the list of countries supplied to the script.
You can remove the "break" statement, but processing the full list of IP addresses that came from the Received: lines of an email file's header will take more time.
The code is short in length.
Code: [Select]

#!/bin/bash
   if [ ! -f "$1" ];then exit 255 ;fi
    ipaddresses=$(grep -B10000 -m1 ^$ "$1" | formail -cX "" | LC_ALL=C grep "^Received:" |  \
        LC_ALL=C grep -E -o '((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])')
   ipaddresses=$(echo "$ipaddresses" | sort -r -u )
   if [ -z "$ipaddresses" ];then exit 254;fi
   listofcountries="US,UM,GOV,MIL,"
   if [ ! -z "$2" ];then listofcountries=$2;fi
   iplocationrejected=0
   for item in $ipaddresses
   do
     set ${item//./ }
     let "ipdecnumber=$1 * 256 ** 3 + $2 * 256 ** 2 + $3 * 256 + $4"
     case $1 in
       127) if  [ $ipdecnumber -eq 2130706433 ];then continue;fi 
         ;;
        10) if [[ $ipdecnumber -gt 167772159  &&   $ipdecnumber -lt 184549376 ]];then continue;fi 
         ;;
       169) if [[ $ipdecnumber -gt 2851995647  &&  $ipdecnumber -lt 2852061184 ]];then continue;fi
         ;;
       172) if [[ $ipdecnumber -gt 2886729727  &&  $ipdecnumber -lt 2887778304 ]];then continue;fi
         ;;
       192) if [[ $ipdecnumber -gt 3232235519  &&  $ipdecnumber -lt 3232301056 ]];then continue;fi
         ;;
     esac
     stemp="$(geoiplookup $ipdecnumber | sed  -e 's/.*: //g' -e 's/,.*//'),"
    if [ $(echo $listofcountries | grep -c "$stemp") -eq 0 ]
        then
       let iplocationrejected+=1
       break   # comment out the break to not  exit  on the first hit of a country not in the list
     fi
 done
exit $iplocationrejected

Here is calling code from another bash script.
I will leave it up to you to create a better routine for the following code, such as a case clause.
Code: [Select]
/test/emailrrr "/home/e-smith/files/ibays/data/files/emails/$filename" "US,GB,IT,"
retval=$?
if [ $retval -eq 0 ];then echo "no ip address found outside of countries wanted";fi
if [ $retval -gt 0 ];then
   if [ $retval -eq 255 ];then  echo "file not found $filename";fi
   if [ $retval -eq 254 ];then  echo "no ip addresses found in received headers";fi
   if [ $retval -lt 254 ];then echo "$retval ip address(s) outside countries range";fi
fi
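
As one possible take on the case clause suggested above, here is a minimal sketch. The function name describe_result is made up; the exit codes 0, 254 and 255 and the messages follow the calling code in this post.

```shell
#!/bin/bash
# Hypothetical wrapper that turns the checker's exit status into a message.
describe_result() {
    # $1 = exit status of the checker script, $2 = file name that was checked
    case $1 in
        0)   echo "no ip address found outside of countries wanted" ;;
        255) echo "file not found $2" ;;
        254) echo "no ip addresses found in received headers" ;;
        *)   echo "$1 ip address(es) outside countries range" ;;
    esac
}

# Usage, assuming the checker is saved as /test/emailrrr as in the post:
#   /test/emailrrr "$filename" "US,GB,IT,"
#   describe_result $? "$filename"
describe_result 254 "example.eml"
```

Putting the two special codes before the catch-all keeps every other non-zero status as a count of rejected addresses.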

« Last Edit: April 20, 2013, 06:11:34 PM by purvis »