AAISP.net Broadband - Broadband you can work with

Dual LNS operation

This article is rather technical and covers the way we connect to BT for broadband links. It relates to 21CN lines, and from August 2009 to 20CN lines as well.

Dual redundant links to BT

To provide the best possible service we aim to have no single point of failure in our network. It is not easy, but the plan is that any switch, router, server or cable could break and service would continue (possibly with a "blip"). There will always be cases where things partly break, which can cause problems, but in most cases something either works or is broken. In some cases even a partial failure can be detected and automatic action taken to move lines to the backup - this is part of the design of the LNSs.

For broadband we have two high speed links to BT, each on its own fibre. We terminate each of these on an LNS (L2TP Network Server) which communicates with BT and handles your broadband connections. The LNSs connect to our core network via two switches and at least two BGP routers.

An LNS handles the session from you to us. This is from the login onwards. If there is a problem with the LNS to which you are connected your session drops, and reconnects. It can reconnect to the other LNS.

Main and backup LNS

We operate the two LNSs as a main and backup. This means all sessions connect to the main LNS normally and only if it fails do sessions connect to the backup (automatically). Using one LNS is important for managing and metering bandwidth of the link to BT and for maintenance.

IPStream connect and session steering

For 21CN, BT send us a request asking which LNS we would like to use. We have a main and backup RADIUS server for this request and reply with the appropriate LNS. This means all 21CN lines connect to the main LNS as a matter of course unless something is broken. The platform RADIUS which handles this request, the LNS and the fibre to BT are all tied together, so a failure of any of them will switch everything to the backup LNS automatically.
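As a rough illustration of the steering decision, here is a minimal Python sketch. It is not the actual platform RADIUS code: the names LnsPath and steer_session and the three health flags are assumptions made for this example, chosen to mirror the idea that the RADIUS, the LNS and the fibre are tied together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LnsPath:
    """Hypothetical model of one LNS plus everything tied to it."""
    name: str
    radius_ok: bool  # platform RADIUS for this path is answering
    lns_ok: bool     # the LNS itself is healthy
    fibre_ok: bool   # the fibre to BT is up

    def usable(self) -> bool:
        # The three are tied together: any one failing takes the path out.
        return self.radius_ok and self.lns_ok and self.fibre_ok


def steer_session(main: LnsPath, backup: LnsPath) -> Optional[str]:
    """Pick the LNS to name in the reply to BT, preferring main."""
    if main.usable():
        return main.name
    if backup.usable():
        return backup.name
    return None  # nothing usable, so no steering reply can be given
```

With both paths healthy every request is answered with the main LNS; the backup is only named once some part of the main path is marked as failed.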

For 20CN, BT do make this initial request but do not allow us to tell them which LNS to use. This lack of session steering is a major problem for all ISPs, but BT have refused to provide it (even though they successfully trialled it in the past). Fortunately the FireBrick FB6000 LNS now has a feature that specifically addresses this: we are able to reject a request that arrives at the wrong LNS early enough that BT divert the connection to the correct LNS within a fraction of a second. We still have the same automatic fallback in the event of failure.
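The early-reject idea can be sketched as a single decision made by each LNS when BT offers it a session. This is a hypothetical Python model only; it does not show how the FB6000 feature is implemented or configured, and the function name and parameters are invented for illustration.

```python
def should_accept(this_lns: str, wanted_lns: str, wanted_lns_up: bool) -> bool:
    """Decide whether this LNS accepts a session BT has offered it.

    Hypothetical logic: accept if this is the wanted LNS. Otherwise
    reject so that BT retries elsewhere, unless the wanted LNS is down,
    in which case accept as the automatic fallback.
    """
    if this_lns == wanted_lns:
        return True
    return not wanted_lns_up
```

For example, should_accept("backup", "main", True) is False (reject, so BT diverts to the main LNS), while should_accept("backup", "main", False) is True (fallback).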

Stray connections

If, for some reason, BT do not see a platform RADIUS response from us, they try the backup. This will cause a connection to the backup LNS. We clear these occasionally if they happen - mostly overnight.

Maintenance

We have a programme of ongoing improvement. This means that we may load new software on to an LNS occasionally. The process involves switching lines to an alternate LNS overnight. This is announced as planned work and is normally done at a weekend or early in the morning. The process is a simple PPP kill on each line, one at a time, so it means a few seconds of outage while your router re-connects to the new LNS.
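The one-line-at-a-time move can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not the real tooling: sessions is a hypothetical mapping of line name to LNS name, and the PPP kill plus reconnect is modelled as a plain reassignment.

```python
def move_lines(sessions, old_lns, new_lns):
    """Move every line on old_lns to new_lns, one at a time.

    sessions maps line name -> LNS name and is updated in place.
    Returns the lines that were moved, in the order they were handled.
    """
    moved = []
    for line in sorted(sessions):
        if sessions[line] != old_lns:
            continue  # already elsewhere, leave it alone
        # PPP kill: the session drops, the router reconnects and lands
        # on new_lns. Modelled here as a simple reassignment.
        sessions[line] = new_lns
        moved.append(line)
    return moved
```

Because each line is handled on its own, only one session is down at any moment, which matches the few seconds of outage per line described above.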

Multiple lines

Sometimes, before we switch LNS, we will set new connections to go to the new LNS for a while (a few hours maybe). This is an extra precaution to bring lines on slowly (as they reconnect for other reasons) and confirm all is well. Usually this is not necessary. However, if we do this, it is possible to have one line of a multiple line login on one LNS and one on the other. This can also happen if there are any issues with BT platform RADIUS or if there is a major issue of some sort. If this happens the service will work as normal but downlink routing will not be load shared: it will be via one of the LNSs only. You can PPP kill (on the control pages) or simply restart the router for the line on the other LNS to force both on to the same LNS if you wish.
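A split login of this kind is easy to spot from a list of current sessions. The Python sketch below is illustrative only; the tuple layout (login, line, LNS) and the function name are assumptions for the example, not a real export format.

```python
from collections import defaultdict


def split_logins(sessions):
    """Return logins whose lines are spread over more than one LNS.

    sessions is an iterable of (login, line, lns) tuples
    (a hypothetical layout chosen for this example).
    """
    lns_by_login = defaultdict(set)
    for login, _line, lns in sessions:
        lns_by_login[login].add(lns)
    return sorted(login for login, used in lns_by_login.items() if len(used) > 1)
```

Any login it returns has lines on different LNSs, so a PPP kill on the odd line out would bring them back together.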

PPP kill

What we mean by a PPP kill is that the session, the connection that joins your router to our LNS, is stopped (killed). Your router then reconnects. Some routers reconnect in under 5 seconds, but some can take longer. In some rare cases a router will sulk and need manually restarting, and we would recommend changing the router if this happens. Generally the process means that a line will be off for 5 to 10 seconds and then reconnect (slightly longer for 20CN lines). We can do a PPP kill, such as when moving lines between LNSs for a software upgrade. You can also do a PPP kill, either by telling your router to reconnect via its user interface or from our control pages. A PPP kill does not mean your line re-syncs (unless your router is particularly strange) and so will have no impact on sync rates and other settings which can be affected by a re-sync.

Graphs

We keep graphs for packet loss, latency and throughput and these come from the LNS. This means that if at any time you change from one LNS to another the graphs are split. During the day the current graph is always from the LNS you last logged in to. Overnight the graphs are archived. When viewing previous days the graphs from multiple LNSs are merged.

Usually this creates a good graph for the previous day. However, in some cases the previous day's data on one of the LNSs may have been deleted before archive. In some cases the two LNSs may operate on a different scale, so the combined graph looks messy and confusing. In some cases we change the graph format slightly and this can also lead to a messy graph.
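To picture the merge of graphs from multiple LNSs, here is a minimal Python sketch that combines per-LNS samples into one day by timestamp. It is an assumption-laden illustration, not the archive process itself: the (timestamp, value) sample layout and the merge_day name are invented, and it assumes both LNSs use the same scale, which as noted above is not always the case.

```python
import heapq


def merge_day(*per_lns_samples):
    """Merge time-ordered samples from several LNSs into one series.

    Each argument is a list of (timestamp_seconds, value) tuples that is
    already sorted by timestamp (hypothetical layout for this example).
    """
    return list(heapq.merge(*per_lns_samples, key=lambda sample: sample[0]))
```

If a line spent the morning on one LNS and the afternoon on the other, the two partial series slot together into a single day; a missing series simply leaves a gap.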