Koozali.org: home of the SME Server

dnscache cname bugs

hanscees

dnscache cname bugs
« on: January 01, 2005, 09:30:09 PM »
Hi,

I have problems with DNS resolution by dnscache on 6.01.
The server as.nu.nl is not resolvable, and neither are many other CNAMEd hosts. It also looks like CNAMEs are never cached. This degrades browsing performance hugely: when dnscache hits such a servfail, all other DNS traffic also slows down or halts. Killing dnscache is then a workaround.

The log in /var/log/dnscache/current shows servfails.

This url:
http://homepages.tesco.net/~J.deBoynePollard/Softwares/djbdns/#dnscache-cname-handling
shows possible problems with dnscache.

Am I the only one with these problems or are there more people?

Can the binary be patched for this?

greetings

Hans-Cees




The as.nu.nl query as answered through bind:
;; ANSWER SECTION:
as.nu.nl.               5M IN CNAME     falk.speedera.net.
falk.speedera.net.      2M IN A         80.15.238.70
falk.speedera.net.      2M IN A         212.3.243.138

;; AUTHORITY SECTION:
speedera.net.           1d4h4m37s IN NS  a.speedera.net.
speedera.net.           1d4h4m37s IN NS  f.speedera.net.
speedera.net.           1d4h4m37s IN NS  h.speedera.net.
speedera.net.           1d4h4m37s IN NS  m.speedera.net.
speedera.net.           1d4h4m37s IN NS  x.speedera.net.
speedera.net.           1d4h4m37s IN NS  y.speedera.net.
speedera.net.           1d4h4m37s IN NS  z.speedera.net.

;; ADDITIONAL SECTION:
a.speedera.net.         6h12m3s IN A    208.185.54.61
f.speedera.net.         6h12m3s IN A    210.224.186.3
h.speedera.net.         6h12m3s IN A    64.14.117.35
m.speedera.net.         6h12m3s IN A    212.162.1.222
x.speedera.net.         6h12m3s IN A    64.41.146.225
y.speedera.net.         6h12m3s IN A    212.187.170.30
z.speedera.net.         6h12m3s IN A    216.200.69.12
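
The TTLs in this older dig output use unit suffixes (5M, 1d4h4m37s) rather than plain seconds. A minimal sketch for converting them, assuming the usual BIND meaning of the suffixes (w, d, h, m, s, case-insensitive). The function name ttl_to_seconds is made up for illustration:

```shell
#!/bin/sh
# Convert a BIND-style TTL such as "1d4h4m37s" or "5M" to seconds.
# A bare number is taken as seconds already.
ttl_to_seconds() {
    printf '%s\n' "$1" | awk '
        BEGIN {
            mult["w"] = 604800; mult["d"] = 86400
            mult["h"] = 3600;   mult["m"] = 60; mult["s"] = 1
        }
        {
            s = tolower($0); total = 0
            # Consume one "<number><optional unit>" token at a time.
            while (match(s, /^[0-9]+[wdhms]?/)) {
                tok  = substr(s, 1, RLENGTH)
                s    = substr(s, RLENGTH + 1)
                unit = substr(tok, length(tok))
                if (unit ~ /[wdhms]/)
                    total += substr(tok, 1, length(tok) - 1) * mult[unit]
                else
                    total += tok
            }
            print total
        }'
}

ttl_to_seconds 5M          # prints 300
ttl_to_seconds 1d4h4m37s   # prints 101077
```

So the CNAME above has a TTL of 300 seconds and the A records behind it 120 seconds.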
nl.linkedin.com/in/hanscees/

CharlieBrady

Re: dnscache cname bugs
« Reply #1 on: January 03, 2005, 02:19:57 AM »
Quote from: "hanscees"
Hi,
Can the binary be patched for this?


No. The license under which djbdns is released only allows distribution of binaries built from unpatched source.

hanscees

Re: dnscache cname bugs
« Reply #2 on: January 03, 2005, 01:53:55 PM »
Quote from: "CharlieBrady"
Quote from: "hanscees"
Hi,
Can the binary be patched for this?


No. The license under which djbdns is released only allows distribution of binaries built from unpatched source.


Well, then there are really two questions. First of all, are more people having trouble? I would think that is very likely. However, I did upgrade from 5.6 to 6.01, so perhaps this is something peculiar to my setup (I can't imagine what, though).

Second, if there are more complaints and it cannot be fixed, I for one would like dnscache removed altogether and bind installed again in some upgrade.
For me this makes the server unusable: DNS should simply work right.

Hans-Cees

CharlieBrady

Re: dnscache cname bugs
« Reply #3 on: January 03, 2005, 07:29:31 PM »
Quote from: "hanscees"

Well, then there are two questions really. First of all, are more people having troubles? I would think that is very likely.


With resolution of as.nu.nl? Yes - I can see that problem. But the problem is clearly the configuration of the nu.nl name servers. I've asked on the djbdns list, and most people there can resolve the name. So the difference you are "seeing" with bind may be illusory - their queries are coming from different addresses, and they may get correct responses from the name servers.
Quote

However, I did do an upgrade from 5.6 to 6.01 so perhaps this is something peculiar with my setup (can't imagine what though).

No, I don't think that's likely.

Quote

Second if there are more complaints, and it cannot be fixed I for one would like dnscache removed altogether and bind installed again in some upgrade.


If you think that's a good idea, then go ahead and do it. Or pay someone to do it. It shouldn't be hard - start with the old e-smith-named.

Quote

For me this makes the server unusable: dns should simply work right.


While ever there are badly configured name servers on the Internet, there will be some problems with name lookup, whichever software you use for the name lookup.

hanscees

Re: dnscache cname bugs
« Reply #4 on: January 03, 2005, 08:45:54 PM »
Quote from: "CharlieBrady"
Quote from: "hanscees"

Well, then there are two questions really. First of all, are more people having troubles? I would think that is very likely.


With resolution of as.nu.nl? Yes - I can see that problem. But the problem is clearly the configuration of the nu.nl name servers.

(But see below: this is not really relevant to my point.)

Could you please expand on this? I cannot analyse it from where I am today. From my IP address I do not get any record for as.nu.nl from any of the DNS servers of nu.nl, that is:
ns1.nu.nl.              86400   IN      A       62.69.162.130
ns2.nu.nl.              86400   IN      A       62.69.162.131
ns3.nu.nl.              86400   IN      A       62.69.162.132

Anyway for my point this is not terribly important.



I know it is very easy to complain, but some complaints carry more weight than others. I am trying to make a point here, not just criticize. I hope that is clear.

My problem with dnscache is not just that some zones do not get resolved. It is that when this happens, all DNS resolution through dnscache, and thus from the networks behind the server, totally and utterly blocks for minutes or longer.

There are only two of us here, and we already have real problems with this. In any serious company network it would make surfing impossible.

That is why I am surprised to find that no one else has seen this: it should happen often if my setup is a common one. But perhaps most folks do not use dnscache, and resolve via the root servers themselves or through their ISP.

So my point should not be taken to be that some sites do not get resolved by dnscache. It is that this blocks all DNS traffic for some time. That is what I would expect everybody to find a serious problem.

And I am explicitly not crying out for anyone to fix this and fix it now. What I am trying to do is discuss how serious the issue is and see what options there are.

friendly greetings

Hans-Cees

hanscees

some more dnscache digging about as.nu.nl
« Reply #5 on: January 04, 2005, 07:56:21 PM »
Hi,
I read the postings on the djbdns mailing list:
http://marc.theaimsgroup.com/?l=djbdns&m=110478767318024&w=2

I did some testing to get to the bottom of this as well as I can.

When I do a dig as.nu.nl on my server with dnscache the query is as follows:

Domain Name System (query)
    Transaction ID: 0x81b0
    Flags: 0x0100 (Standard query)
        0... .... .... .... = Response: Message is a query
        .000 0... .... .... = Opcode: Standard query (0)
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... .0.. .... = Z: reserved (0)
        .... .... ...0 .... = Non-authenticated data OK: Non-authenticated data is unacceptable
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 0
    Queries

the answer is as follows:

Internet Protocol, Src Addr: 62.69.162.131 (62.69.162.131), Dst Addr: 62.216.12.164 (62.216.12.164)
User Datagram Protocol, Src Port: domain (53), Dst Port: 30462 (30462)
Domain Name System (response)
    Transaction ID: 0x81b0
    Flags: 0x8185 (Standard query response, Refused)
        1... .... .... .... = Response: Message is a response
        .000 0... .... .... = Opcode: Standard query (0)
        .... .0.. .... .... = Authoritative: Server is not an authority for domain
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... 1... .... = Recursion available: Server can do recursive queries
        .... .... .0.. .... = Z: reserved (0)
        .... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
        .... .... .... 0101 = Reply code: Refused (5)
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 0
    Queries


If I query that server directly with dig @62.69.162.131 as.nu.nl, I only get an answer when I add +norecurse.

In the dump I see that the only bit that differs is the recursion desired (RD) bit. So that is what makes the difference.
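
The flags values from the two dumps can be decoded by hand to check this. A small sketch using shell arithmetic and the header bit layout from RFC 1035; decode_dns_flags is a hypothetical helper name, not part of dig or any other tool:

```shell
#!/bin/sh
# Decode the 16-bit DNS header flags word (RFC 1035, section 4.1.1).
# $1 is the flags value, e.g. 0x0100 (decimal works too).
decode_dns_flags() {
    flags=$(( $1 ))
    printf 'qr=%d aa=%d tc=%d rd=%d ra=%d rcode=%d\n' \
        $(( ($flags >> 15) & 1 )) \
        $(( ($flags >> 10) & 1 )) \
        $(( ($flags >> 9) & 1 )) \
        $(( ($flags >> 8) & 1 )) \
        $(( ($flags >> 7) & 1 )) \
        $(( $flags & 15 ))
}

decode_dns_flags 0x0100   # the query sent by dnscache
decode_dns_flags 0x8185   # the reply from 62.69.162.131
```

Both words have rd=1; the reply additionally shows ra=1 and rcode=5 (Refused), matching the dump above. A query sent with +norecurse would carry rd=0.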

When I did the same on a server with bind, it always sent its queries with the recursion bit off:

11:48:46.525977 216.145.223.225.64960 > 128.63.2.53.53:  58855 [1au] A? as.nu.nl. (37)
11:48:46.635677 128.63.2.53.53 > 217.149.223.225.64960:  58855- 0/7/10 (375) (DF)
11:48:46.636665 217.149.223.225.64960 > 192.36.144.116.53:  59936 [1au] A? as.nu.nl. (37)
11:48:46.683251 192.36.144.116.53 > 217.149.223.225.64960:  59936- 0/3/4 (139) (DF)
11:48:46.683645 217.149.223.225.64960 > 62.69.162.130.53:  221 [1au] A? as.nu.nl. (37)
11:48:46.697885 62.69.162.130.53 > 217.149.223.225.64960:  221* 1/0/1 CNAME falk.speedera.net. (68)
11:48:46.698206 217.149.223.225.64960 > 192.203.230.10.53:  16046 [1au] A? falk.speedera.net. (46)
11:48:46.871986 192.203.230.10.53 > 217.149.223.225.64960:  16046- 0/13/16 (531)
11:48:46.874029 217.149.223.225.64960 > 192.54.112.30.53:  50275 [1au] A? falk.speedera.net. (46)
11:48:46.977148 192.54.112.30.53 > 217.149.223.225.64960:  50275 FormErr- [0q] 0/0/0 (12) (DF)
11:48:46.977270 217.149.223.225.64960 > 192.54.112.30.53:  50275 A? falk.speedera.net. (35)
11:48:47.085167 192.54.112.30.53 > 217.149.223.225.64960:  50275- 0/7/7 (259) (DF)
11:48:47.086064 214.145.223.225.64960 > 212.187.170.30.53:  6156 [1au] A? falk.speedera.net. (46)


This does not seem to be the case with my e-smith dnscache installation.

From the djbdns mailing list discussion I do not get a clear picture of what is going on. Could it be that some environment variable is set wrong?

To set the FORWARDONLY environment variable for dnscache:
    echo 1 > /service/dnscache/env/FORWARDONLY

My FORWARDONLY is set to 0, so that should be normal, I think.

However, my dump also shows nicely that the dnscache server queries as.nu.nl 14 times and is refused 12 times. It cycles through all three DNS servers for nu.nl.

To show that the problem is not academic and happens more often:
[root@idsnew dnscache]# egrep fail * | wc -l
    161

egrep fail * | awk  '{print $2 " " $3}' | sort | uniq -c

     1   servfail 1.1.168.192.in-addr.arpa.
      1   servfail 119.56.121.195.in-addr.arpa.
      2   servfail 123.220.42.64.in-addr.arpa.
      2   servfail 130.21.252.211.in-addr.arpa.
      3   servfail 141.22.173.61.in-addr.arpa.
      2   servfail 142.122.208.208.in-addr.arpa.
      1   servfail 146.144.214.214.209.in-addr.arpa.
      2   servfail 151.22.174.82.in-addr.arpa.
      1   servfail 165.159.181.67.in-addr.arpa.
      3   servfail 183.23.173.61.in-addr.arpa.
      2   servfail 19.200.254.65.in-addr.arpa.
      2   servfail 203.88.34.207.in-addr.arpa.
      1   servfail 233.190.116.83.in-addr.arpa.
      1   servfail 234.1.168.192.in-addr.arpa.
      1   servfail 24.13.60.68.in-addr.arpa.
      2   servfail 24.23.174.82.in-addr.arpa.
      2   servfail 2.56.196.81.in-addr.arpa.
      2   servfail 29.67.174.82.in-addr.arpa.
      2   servfail 36.189.236.205.in-addr.arpa.
      2   servfail 49.218.234.216.in-addr.arpa.
      1   servfail 61.190.118.83.in-addr.arpa.
      4   servfail 70.222.98.61.in-addr.arpa.
      2   servfail 77.22.174.82.in-addr.arpa.
      1   servfail 83.130.117.83.in-addr.arpa.
      2   servfail 89.48.100.66.in-addr.arpa.
      6   servfail as.nu.nl.
     15   servfail ds.serving-sys.com.
      1   servfail ns.telepac.pt.
     20   servfail www.fs.fed.us.
      5   servfail www.home-klimat.info.
      1   servfail www.regsoft.net.
      5   servfail www.uvlagnitel.info.
     20   servfail www.webvragenlijst.nl.
      5   servfail www.woonactueel.nl.

greetings

Hans-Cees

CharlieBrady

Re: some more dnscache digging about as.nu.nl
« Reply #6 on: January 04, 2005, 08:20:28 PM »
Quote from: "hanscees"

In the dump I see that the only bit set different is the recursive bit. So this makes the difference.


Yep.

Quote

To set the FORWARDONLY environment variable for dnscache:
    echo 1 > /service/dnscache/env/FORWARDONLY

My forwardonly is set on 0, so that is normal I think.


No, that is where the error is. Being set to 0 is the same to dnscache as being set to 1. What needs to happen is for the environment variable not to be set at all.

You can do that temporarily by doing:

rm /service/dnscache/env/FORWARDONLY
svc -t /service/dnscache

You can do it permanently by grabbing the latest e-smith-dnscache package (e-smith-dnscache-0.3.0-04) from Mitel's devel directory on one of the mirror sites, then doing:

/etc/e-smith/events/actions/dnscache-conf
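
Why a value of 0 does not help: assuming dnscache is started under daemontools' envdir (the standard djbdns setup), envdir sets a variable for every non-empty file in the env directory, and only a completely empty or missing file leaves the variable unset. A rough sketch of that check; forwardonly_state is an illustrative name, and the directory is passed as an argument so it can be tried on a scratch directory first:

```shell
#!/bin/sh
# Report whether envdir would put FORWARDONLY into dnscache's environment.
# $1 is the envdir directory (e.g. /service/dnscache/env).
# Assumption: envdir semantics, where a non-empty file sets the variable
# and a 0-byte or absent file does not. The file's content is irrelevant.
forwardonly_state() {
    if [ -s "$1/FORWARDONLY" ]; then
        echo "set"      # forward-only mode, even if the file says 0
    else
        echo "unset"    # dnscache resolves from the root servers itself
    fi
}
```

Running forwardonly_state /service/dnscache/env before and after the rm should show the change from set to unset; dnscache still needs the svc -t restart to pick it up.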

Quote

To show that the problem is not academic and happens more often:
[root@idsnew dnscache]# egrep fail * | wc -l
    161

egrep fail * | awk  '{print $2 " " $3}' | sort | uniq -c

     1   servfail 1.1.168.192.in-addr.arpa.
      1   servfail 119.56.121.195.in-addr.arpa.
      2   servfail 123.220.42.64.in-addr.arpa.
      2   servfail 130.21.252.211.in-addr.arpa.
      3   servfail 141.22.173.61.in-addr.arpa.
      2   servfail 142.122.208.208.in-addr.arpa.
      1   servfail 146.144.214.214.209.in-addr.arpa.
      2   servfail 151.22.174.82.in-addr.arpa.
      1   servfail 165.159.181.67.in-addr.arpa.
      3   servfail 183.23.173.61.in-addr.arpa.
      2   servfail 19.200.254.65.in-addr.arpa.
      2   servfail 203.88.34.207.in-addr.arpa.
      1   servfail 233.190.116.83.in-addr.arpa.
      1   servfail 234.1.168.192.in-addr.arpa.
      1   servfail 24.13.60.68.in-addr.arpa.
      2   servfail 24.23.174.82.in-addr.arpa.
      2   servfail 2.56.196.81.in-addr.arpa.
      2   servfail 29.67.174.82.in-addr.arpa.
      2   servfail 36.189.236.205.in-addr.arpa.
      2   servfail 49.218.234.216.in-addr.arpa.
      1   servfail 61.190.118.83.in-addr.arpa.
      4   servfail 70.222.98.61.in-addr.arpa.
      2   servfail 77.22.174.82.in-addr.arpa.
      1   servfail 83.130.117.83.in-addr.arpa.
      2   servfail 89.48.100.66.in-addr.arpa.
      6   servfail as.nu.nl.
     15   servfail ds.serving-sys.com.
      1   servfail ns.telepac.pt.
     20   servfail www.fs.fed.us.
      5   servfail www.home-klimat.info.
      1   servfail www.regsoft.net.
      5   servfail www.uvlagnitel.info.
     20   servfail www.webvragenlijst.nl.
      5   servfail www.woonactueel.nl.


You'll get some improvement in that by making the change I recommend above. I think you'll find most or all of the in-addr.arpa. lookups to still have problems - just due to bad delegations, or delegations to servers which aren't accessible. You'll also find that no matter what DNS resolver software you use, you'll find occasional problems that are beyond your control.
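
To see how much of the servfail count is reverse lookups and how much is ordinary names, the log lines can be split on the in-addr.arpa. suffix. A minimal sketch, assuming the same field layout the awk command above relies on (second field "servfail", third field the name); servfail_split is a made-up name:

```shell
#!/bin/sh
# Count dnscache servfail lines, split into reverse (in-addr.arpa.)
# and forward lookups. Reads the files given as arguments, or stdin.
servfail_split() {
    awk '$2 == "servfail" {
             if ($3 ~ /\.in-addr\.arpa\.$/) reverse++
             else forward++
         }
         END { printf "reverse=%d forward=%d\n", reverse, forward }' "$@"
}
```

From the log directory this could be run as egrep fail * | servfail_split. Comparing the forward count before and after removing FORWARDONLY gives a rough idea of how much the change helped.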

hanscees

dnscache cname bugs
« Reply #7 on: January 04, 2005, 08:31:45 PM »
You just made me very happy!!

I removed /service/dnscache/env/FORWARDONLY

did

svc -t /service/dnscache

and voila:


[root@idsnew env]# dig as.nu.nl

; <<>> DiG 9.2.1 <<>> as.nu.nl
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25175
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;as.nu.nl.                      IN      A

;; ANSWER SECTION:
as.nu.nl.               300     IN      CNAME   falk.speedera.net.
falk.speedera.net.      120     IN      A       213.200.99.195
falk.speedera.net.      120     IN      A       213.200.99.196

As soon as there is a new e-smith ISO with everything up to date I will do a brand-new install.
Thanks a lot for helping me!

I know DNS is always a pain in the ass: I help people with it every day. Reverse DNS in particular is often a mess.

Please have a look here
http://www.bomengids.nl/uk/soortenusa/quaking_aspen__populus_tremuloides_bryce_torrey.html
for some pictures on the server you just fixed, to help you relax (don't worry, just trees).


Hans-Cees

mbachmann

dnscache cname bugs
« Reply #8 on: January 10, 2005, 08:49:25 AM »
No probs with that here.

; <<>> DiG 9.2.1 <<>> as.nu.nl
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24719
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;as.nu.nl.                      IN      A

;; ANSWER SECTION:
as.nu.nl.               300     IN      CNAME   falk.speedera.net.
falk.speedera.net.      120     IN      A       213.200.99.195
falk.speedera.net.      120     IN      A       212.187.170.12

;; Query time: 244 msec
;; SERVER: 192.168.1.100#53(192.168.1.100)
;; WHEN: Mon Jan 10 08:37:46 2005
;; MSG SIZE  rcvd: 89
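
A saved dig answer like the ones in this thread can be checked mechanically: start at the queried name, follow any CNAME, and list the A records the chain ends on. A small sketch that only parses text in the five-column layout shown above (name, TTL, class, type, data); follow_cname is a hypothetical helper, not a dig option:

```shell
#!/bin/sh
# Follow the CNAME chain for a name through dig output read from stdin
# and print the final name with its A records.
# $1 is the queried name, with trailing dot (e.g. as.nu.nl.).
follow_cname() {
    awk -v name="$1" '
        /^;/ || NF < 5 { next }                 # skip comments and blanks
        $4 == "CNAME"  { cname[$1] = $5 }
        $4 == "A"      { addrs[$1] = addrs[$1] " " $5 }
        END {
            hops = 0
            # Cap the number of hops so a CNAME loop cannot hang the script.
            while ((name in cname) && hops++ < 16) name = cname[name]
            print name ":" addrs[name]
        }'
}
```

For example, dig as.nu.nl | follow_cname as.nu.nl. should print falk.speedera.net. followed by its addresses once resolution works; an empty address list means the chain did not end in any A record.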