Koozali.org: home of the SME Server

Denial of Service, poor hardware, or bad software settings?

Offline mmccarn

« on: November 02, 2006, 06:34:18 PM »
I have one site that repeatedly shows the following symptoms:

1. Too many connections: 40 >= 40. Waiting one second
SMTP connections reach the configured maximum "Instances" for smtpd, resulting in a continuous stream of "Too many connections: 40 >= 40. Waiting one second" in /var/log/qpsmtpd/current.

2. Server Overload
The server response slows to a crawl at this time.

3. Duplicate Email
Many emails received while the connection limit is saturated like this arrive two, three, or even more times. Examination of the headers shows that the messages were re-sent by the sending email host several hours apart. The senders get notices from their mail servers saying that their email couldn't be sent because our server disconnected unexpectedly, and that delivery will be retried later. I believe this happens when qpsmtpd and spamd take too long to respond "OK" to the sending server: the sending server disconnects and decides to retry later, even though the first copy may already have been accepted, so each retry can deliver another copy.

4. Deny smtp connections using iptables:
The system stays in this overloaded condition until I scan /var/log/qpsmtpd/* for all hosts that have been denied by either 'dnsbl' or 'check_earlytalker', add them to iptables with a DENY rule, then run 'service qpsmtpd restart'. The system is usually so overloaded that I cannot use "smtpd DenyHosts" and "signal-event email-update": the "email-update" never finishes (I've left it running for a couple of hours). I have worked out a set of grep / sed commands that let me manually add "deny" rules to the appropriate "Inbound_TCP_####" chain. (I know that these rules will go away if I reboot; that's OK.)
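The manual grep / sed step in item 4 could be sketched roughly as follows. This is only an illustration, not the commands actually used: the sample log lines are invented (the real /var/log/qpsmtpd/current format may differ), the chain name `Inbound_TCP_EXAMPLE` is a placeholder for the real "Inbound_TCP_####" chain, and the rule uses the `DROP` target because iptables has no `DENY` target. The script only builds the commands; it does not run them.

```python
import re

# Plugins whose denials are treated as grounds for a firewall block (named in the post).
DENY_PLUGINS = ("dnsbl", "check_earlytalker")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def denied_hosts(log_lines):
    """Return the sorted, de-duplicated IPv4 addresses on lines that mention a deny plugin."""
    hosts = set()
    for line in log_lines:
        if any(plugin in line for plugin in DENY_PLUGINS):
            match = IPV4.search(line)
            if match:
                hosts.add(match.group(0))
    return sorted(hosts)


def iptables_commands(hosts, chain):
    """Build (but do not execute) one DROP rule per host for inbound SMTP."""
    return [
        f"iptables -I {chain} -s {host} -p tcp --dport 25 -j DROP"
        for host in hosts
    ]


# Hypothetical log lines, for illustration only.
sample = [
    "1234 dnsbl plugin: denied 192.0.2.10 listed in example list",
    "1235 check_earlytalker plugin: remote host 198.51.100.7 started talking early",
    "1236 spamassassin plugin: checked message from 203.0.113.5",
]

for command in iptables_commands(denied_hosts(sample), "Inbound_TCP_EXAMPLE"):
    print(command)
```

Printing the commands first makes it possible to review the list before pasting it into a root shell, which matters on a box that is already overloaded.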

--------------------------------------------------------------------

I believe that this behavior results from a denial-of-service attack against my server that uses a sort of reverse-tarpit scheme: the attacker connects to my server, then holds the connection open using a TCP window size of 0 bytes. I don't know how to prove this. If I am correct, this seems guaranteed to succeed as a DoS technique, since qpsmtpd allows each host to connect before it begins to apply the various plugins -- unless there is a qpsmtpd plugin that can manage a timed database of hosts that are not even allowed to open TCP connections, a sort of RBL that is tied to iptables instead of qpsmtpd.
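The "timed database of hosts" idea could look something like the sketch below. This is not an existing qpsmtpd plugin and makes no claim about how one would hook into qpsmtpd or iptables; the class name `TimedBlocklist` and its methods are hypothetical, and the in-memory dictionary stands in for a real database. The idea is simply that each blocked host carries an expiry time, and a periodic purge returns the hosts whose firewall rules should be removed.

```python
import time


class TimedBlocklist:
    """In-memory map of host -> expiry time (a stand-in for a real database)."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self.expiry = {}

    def block(self, host):
        """Block (or re-block) a host for ttl seconds from now."""
        self.expiry[host] = self.clock() + self.ttl

    def is_blocked(self, host):
        """True while the host has an entry that has not yet expired."""
        deadline = self.expiry.get(host)
        return deadline is not None and self.clock() < deadline

    def purge(self):
        """Drop expired entries and return them, e.g. to delete their firewall rules."""
        now = self.clock()
        expired = sorted(h for h, t in self.expiry.items() if t <= now)
        for host in expired:
            del self.expiry[host]
        return expired


blocklist = TimedBlocklist(ttl_seconds=3600)
blocklist.block("192.0.2.10")
print(blocklist.is_blocked("192.0.2.10"))
```

A one-hour lifetime is an arbitrary choice here; the point of the expiry is that a host that stops misbehaving eventually gets another chance without manual cleanup.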

Hardware
Dell Dimension L933r
256MB RAM
40GB IDE HDD

Software
SME 7 Final installed from scratch
Email forwarded to internal exchange server
smeserver-sme7admin-1.1.0-1.noarch.rpm
smeserver-spamassassin-features-0.0.2-0.noarch.rpm
smeserver-saco-qmHandle-1.3.1-1.noarch.rpm
smeserver-sarg-1.4.1-5.i386.rpm
smeserver-sysmon-5.0-3.noarch.rpm
sysstat-5.0.5-11.rhel4.i386.rpm
mondo-2.0.9-2.rhel4.i586.rpm

I plan to upgrade the RAM to 512MB to see if that has any effect, and would appreciate any insights from the community on what else I could do to resolve this issue.

- Michael

Offline JonB

« Reply #1 on: November 03, 2006, 01:47:04 AM »
Michael,

My server is also getting hammered at the moment, in bursts around an hour apart. I am not yet reaching the maximum number of SMTP connections, and I think the reason is that I am running a 3GHz processor and 1.5GB of RAM, so I can process the connections quickly.

What I am noticing though is that the number of DNS queries that the server is having to make is killing my upload bandwidth (512Kb) during the mail bursts.

There has been a huge increase in spam and virus emails in the past week and I am wondering if the DNSBL providers are getting hammered with DNS queries which in turn is slowing down their response time, hence connections are held until a response is received.
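The link between slow DNSBL responses and held connections can be put into rough numbers with Little's law (average connections in progress = arrival rate x average time each connection is held). The figures below are assumptions chosen for illustration, not measurements from either server: 2000 connections per hour and a limit of 40 instances.

```python
def expected_concurrency(arrivals_per_hour, hold_seconds):
    """Little's law: average connections in progress = arrival rate x hold time."""
    return arrivals_per_hour / 3600.0 * hold_seconds


def max_hold_seconds(instances, arrivals_per_hour):
    """Longest average hold time before all `instances` slots are busy on average."""
    return instances * 3600.0 / arrivals_per_hour


# Assumed figures for illustration only.
ARRIVALS_PER_HOUR = 2000
INSTANCES = 40

for hold in (5, 30, 72):
    busy = expected_concurrency(ARRIVALS_PER_HOUR, hold)
    print(f"hold {hold:>3}s -> about {busy:.1f} connections in progress")

print("limit reached at about", max_hold_seconds(INSTANCES, ARRIVALS_PER_HOUR), "seconds per connection")
```

Under these assumptions the connection limit is only reached once each connection is held for over a minute on average, so anything that stretches per-connection time (slow DNS answers, a swapping server, or a client that stalls on purpose) could produce the same symptom.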

Do a tcpdump on port 53 and see what is happening.

I have been considering changing the order in which qpsmtpd calls the plugins. Currently the order is:

Code:
auth/auth_cvm_unix_local
check_earlytalker
count_unrecognized_commands
# bcc disabled
check_relay
check_norelay
whitelist_soft
require_resolvable_fromhost
check_basicheaders
rhsbl
dnsbl
check_badmailfrom
check_badrcptto_patterns
check_badrcptto
check_spamhelo
check_goodrcptto extn -
rcpt_ok
virus/pattern_filter
tnef2mime
spamassassin
virus/clamav
queue/qmail-queue


We do a lot of checking before we get to the DNSBL. I think the DNSBL check should be near the beginning, either just before or just after check_earlytalker. If the IP is on one of the DNSBLs we are using, the connection should be dropped straight away.
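The proposed reordering can be sketched as a small list operation. The helper `move_after` is hypothetical, and the list is the first part of the plugin order quoted above. Two caveats are assumptions rather than facts from this thread: whether moving a plugin changes when a rejection actually happens depends on which SMTP hook that plugin acts on, and on SME Server the plugin list may be regenerated from templates, so a hand edit might not survive a reconfigure.

```python
def move_after(plugins, name, anchor):
    """Return a copy of `plugins` with `name` moved to just after `anchor`."""
    if name not in plugins:
        raise ValueError(f"{name} is not in the plugin list")
    reordered = [p for p in plugins if p != name]
    # list.index raises ValueError if the anchor plugin is missing.
    reordered.insert(reordered.index(anchor) + 1, name)
    return reordered


# First part of the plugin order quoted in the post.
current = [
    "auth/auth_cvm_unix_local",
    "check_earlytalker",
    "count_unrecognized_commands",
    "check_relay",
    "check_norelay",
    "whitelist_soft",
    "require_resolvable_fromhost",
    "check_basicheaders",
    "rhsbl",
    "dnsbl",
    "check_badmailfrom",
]

for plugin in move_after(current, "dnsbl", "check_earlytalker"):
    print(plugin)
```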

Jon

Offline mmccarn

« Reply #2 on: November 03, 2006, 04:30:32 AM »
I don't see much going on on port 53 -- but I'm not having any trouble right now, either.  I'll check it again the next time I get "hit".

I only have a 384K SDSL connection; if your 512 is saturated with DNS traffic, my 384K could be, too... Luckily, I have more bandwidth on the way!

I am seeing quite a few connections denied by my temporary iptables "deny" rules (3400 attempts on Monday, about 1800 each on Tue, Wed & Today...)  I suppose this would "solve" the problem temporarily whether it's my suspected denial-of-service attack or a dnsbl overload.

I thought that tinydns would cache the dnsbl queries, reducing dns bandwidth requirements.  Is this not the case?

Have you ever tried the "greylisting" plugin?  It looks like it would significantly reduce the load on all of the other plugins, and is included with SME 7 but I've been afraid to turn it on since I've never used it...

Offline raem

« Reply #3 on: November 03, 2006, 09:26:19 AM »
mmccarn

> SME 7 Final installed from scratch

Have you done a
yum update
Reconfigure & Reboot


> smeserver-spamassassin-features-0.0.2-0.noarch.rpm

What is that rpm for and where did you get it from ?


> I plan to upgrade the RAM to 512MB to see if that has any effect

256MB is minimal for SME 7; I'd even up it to 1GB if you can

You could tweak the settings, see
config show smtpd
config show qpsmtpd
add/reduce RBL lists ?
disable RHSBL if enabled
adjust qmail/qpsmtpd concurrency settings ?

Offline byte

« Reply #4 on: November 03, 2006, 10:42:22 AM »
Quote from: "RayMitchell"
> > smeserver-spamassassin-features-0.0.2-0.noarch.rpm
>
> What is that rpm for and where did you get it from ?


From here...

http://mirror.contribs.org/smeserver/contribs/michaelw/sme7/
--[byte]--

Have you filled in a Bug Report over @ http://bugs.contribs.org ? Please don't wait to be told this way you help us to help you/others - Thanks!

Offline piran

« Reply #5 on: November 03, 2006, 10:42:28 AM »
In addition to Ray's help... a sideways thought on CPU cycles. Yes, you
don't seem to have very much RAM but what about CPU horsepower? (933)

SpamAssassin uses up quite a bit of CPU. With such volumes of traffic you
might consider switching off anything to do with SA, get all the connections
and possible traffic through the server in real time. Yes, you'd need to rely
on the workstations picking out any malware but perhaps the choking of
the server might be addressed. The other RBL etc plugins do quite a bit
of incredibly effective filtering... here on my site they are 100% effective
and I haven't yet (!?) had the urge to turn on SA (ever).

Just a thought.

Offline mmccarn

« Reply #6 on: November 03, 2006, 03:08:34 PM »
Quote from: "RayMitchell"
> Have you done a
> yum update
> Reconfigure & Reboot

Yes; I forgot to mention that.

Quote from: "RayMitchell"
> > smeserver-spamassassin-features-0.0.2-0.noarch.rpm
>
> What is that rpm for and where did you get it from ?

This rpm from Michael Weinberger is "an RPM that enables site-wide Bayesian Filter, enables blacklist testing and introduces spamassassin properties BayesAutoLearnThresholdSpam and BayesAutoLearnThresholdNonspam". For more info, see Spam Filter

Code:
smtpd=service
    Authentication=disabled
    DenyHosts=58.246.197.21
    Instances=25
    InstancesPerIP=5
    MaximumDateOffset=0
    PatternsScan=disabled
    Proxy=disabled
    TCPPort=25
    TCPProxyPort=25
    VirusScan=enabled
    access=public
    status=enabled
    tnef2mime=enabled
I've changed "Instances" from 40 to 25, which seems to work fine. As long as things are working OK, I haven't had over 9 concurrent connections. I tried "Instances=80", but that totally killed me!

Code:
qpsmtpd=service
    Bcc=disabled
    BccUser=maillog
    DNSBL=enabled
    LogLevel=8
    MaxScannerSize=25000000
    RBLList=sbl-xbl.spamhaus.org,whois.rfc-ignorant.org,dnsbl.njabl.org,relays.ordb.org
    RHSBL=enabled
    RequireResolvableFromHost=yes
    SBLList=dsn.rfc-ignorant.org
    access=public
    status=enabled


Ray: Why do you recommend disabling RHSBL checks?

piran: I agree that it is likely that disabling SA would solve my problem...but what is the effect on the amount of SPAM delivered to end-users?  The email is passed to an internal Exchange server that is running AVG Antivirus for Exchange as well as [oxymoron]Microsoft Intelligent[/oxymoron] Message Filtering, so my users might not get hit too hard...

If Ray ("up it to 1GB if you can") and piran ("what about CPU horsepower") are correct, should the SME 7 minimum hardware requirements (400MHz CPU, 128MB RAM) be changed? (OK, I'll answer this one myself. Speaking of the minimum hardware requirements, the SME 7 Manual says: "Note that we do not believe such a system will provide satisfactory performance for features such as webmail, remote access via PPTP, Virus and Spam Scanners, which are cpu intensive will not perform well on this platform." It's amazing what you learn if you read the text instead of just glancing at the pictures and charts!)

Offline raem

« Reply #7 on: November 04, 2006, 09:48:42 AM »
mmccarn

> Why do you recommend disabling RHSBL checks?

There were some comments on devinfo (I think) about the effectiveness or otherwise of RHSBL; there are issues that you should be aware of if you choose to enable it.

DNS RBL is OK to enable, but you should be aware of the consequences there too, i.e. potential loss of email from legitimate senders wrongly listed on RBLs, particularly if you use non-conservative lists.

Neither of these is enabled by default on sme7.

It was also related to the comment
"What I am noticing though is that the number of DNS queries that the server is having to make is killing my upload bandwidth (512Kb) during the mail bursts."

So disabling one of those lookup features may help improve speed; it's a case of try it and see.

Offline JonB

« Reply #8 on: November 28, 2006, 09:49:30 PM »
I thought that I had better reply to this to report that my issues have been solved.

I had initially thought that at times the large number of DNS queries that the server was making, in response to the large number of spam emails I was receiving (> 2000/hour) was killing my upload bandwidth (512Kb). Internet access became impossible and mail was being delayed due to hosts not being found.

Yesterday I finally figured out what was going on. It was my ADSL router that was choking the connections. It was basically running out of memory and cpu to process all the connections through the NAT firewall on the router.

Here in NZ almost all ISPs use PPPoA for ADSL connections, and bridging to disable the NAT firewall is not an option.

However my ADSL router has an option called Half-bridge mode, which when enabled uses the DHCP server in the router to forward the public IP to the external interface of the server, bypassing the NAT firewall.

Since I have reconfigured the ADSL router to half-bridge mode and the server external interface to use DHCP to get the IP address I have had no issues with the router choking the bandwidth.

Jon

Offline mmccarn

« Reply #9 on: November 29, 2006, 03:18:35 AM »
I tried upgrading RAM only to find that my new RAM caused freeze-ups (Aargh!)

Of my two systems, I'm running one with RHSBL and Spamassassin disabled, and the other with an extensive IPTables "block" list for smtp servers that show up as early talkers or that are denied by DNSBL.

Either solution seems to keep the server humming along.

(both of these systems already had public IPs).

Offline robwellesley

« Reply #10 on: December 19, 2006, 10:45:21 AM »
Quote from: "JonB"
> ..... and the server external interface to use DHCP to get the IP address ....
> Jon


How did you do that Jon?

Rob

Offline JonB

« Reply #11 on: December 19, 2006, 11:22:17 AM »
Rob,

You need to log into the console as admin and reconfigure the external interface to use option 2

 2.    Use DHCP (send ethernet address as client identifier)

The ADSL router in Half-bridge mode will use the DHCP server in the router to forward the public IP to the external interface of the SME, bypassing the NAT firewall on the router.

Jon

Offline mmccarn

« Reply #12 on: January 02, 2007, 04:04:59 PM »
A followup note on my original problems in order to help others with limited hardware resources:
    1. I never found any RAM that worked well in these systems although I didn't look too hard once the system performance stabilized.
    2. One system (PIII/256MB) has run OK for 35 days now with RHSBL disabled but with spamassassin enabled.
    3. A second system (PIII/192MB) has run OK for 35 days with both RHSBL and spamassassin disabled.
    4. I re-enabled spamassassin on the 192MB system 2 days ago, resulting in a clear increase in CPU, swap, & memory use, but the system has been fine so far.
    5. This client is heavily involved in US politics, so my original problems could well have been related to heavier-than-normal email or other traffic related to the 2006 US elections.

Offline robwellesley

« Reply #13 on: January 15, 2007, 12:29:34 AM »
Quote from: "JonB"
> Rob,
>
> You need to log into the console as admin and reconfigure the external interface to use option 2
>
>  2.    Use DHCP (send ethernet address as client identifier)
>
> The ADSL router in Half-bridge mode will use the DHCP server in the router to forward the public IP to the external interface of the SME, bypassing the NAT firewall on the router.
>
> Jon


So an ifconfig at the console will show the public IP as the IP for the 'external' NIC?

Offline JonB

« Reply #14 on: January 15, 2007, 01:21:44 AM »
Rob,

Correct. If you like give me a call 021 326419.

Jon