~ Basic Help ~

(Courtesy of fravia's searchlore.org)

(¯`·.¸(¯`·.¸ A bird's eye view ¸.·´¯)¸.·´¯)
by _dose

published @ searchlores in May 2002

_dose is a Linux +HCU guru and security expert, but first of all a good friend, whom I have had the pleasure of meeting many times around Europe. He is trying (a little :-) to keep the technical jargon to a minimum. But 'technically jargoned' or not, this reading -believe me- will be a salutary enrichment for you. Sparkles of knowledge fizzle around this script... and may even motivate readers to contribute on their own... one sunny day...

A bird's eye view


an essay on the fabric of the Internet

by _dose, 05/2002

"You can read it?"
"No, I was just clearing my throat."

introduction

This is an introduction to how the internet is built. You see, most people don't know how this thing works in the first place, even if they work in technical areas. The only people that know are those that work in the higher end networking areas of ISPs. But I figure that some background never hurt anybody, and since we all use this in one function or another I'd take a shot at briefly outlining the structure of the global internet. And who knows? At some point this information might actually be useful for you.

I'll keep the technical jargon to a minimum, or at least I'll try. A little.

an opening

The Internet is a massive world-wide interconnected network - not a single network, but an interconnection of many. That's where the name Internet comes from. This means that there are lots of networks out there that can talk to each other, or use the networks they're connected to in order to communicate with networks they're not connected to. A single network can be built up of many things. Some networks exist only in one datacenter, others are built up of national, international or intercontinental links. The important thing here is that each network, whatever its size, is controlled and maintained by one administrative entity (usually a corporation). These networks are referred to as Autonomous Systems (AS). Each AS is assigned a unique 16-bit number called an Autonomous System Number (ASN).

Now for some administrative politics. There is a central organisation that deals with assigning 'numbers'. This is the Internet Assigned Numbers Authority (IANA). They control the delegation of ASNs, IP ranges and other numeric resources that require a central authority. Historically they assigned IP blocks and ASNs directly to corporations or organisations, but this is no longer the case. As the Internet grew, separate organisations were set up to deal with this process. These organisations are called 'regional registries'. Broadly speaking (very broadly), there are three of these, organised geographically. They are ARIN (American Registry for Internet Numbers), APNIC (Asia Pacific Network Information Centre) and RIPE (Réseaux IP Européens, which translates as European IP Networks). These organisations can be found at,

http://www.iana.org/ - Top level
http://www.arin.net/ - North and South America, Caribbean and sub-Saharan Africa
http://www.ripe.net/ - Europe, northern Africa, the Middle East and parts of northern Asia
http://www.apnic.net/ - Asia-Pacific region

When an organisation wants to build a network that actively participates in the global internet (as opposed to simply buying traffic from an ISP), it will request an ASN and IP allocation from the regional registry for their region. The most important part is receiving an ASN (I will elaborate on this later) and after that an IP allocation.

An IP allocation is one or more blocks of contiguous IP addresses that this organisation receives and that are unique to this organisation. Actually, this is not entirely accurate. The IP allocation is tied to the ASN, but more on that later. An organisation can have more than one ASN; this is usually the result of mergers or acquisitions. The regional registries will not assign more than one ASN to one organisation, but once two organisations merge the new entity holds both ASNs. Massive acquisitions have led to single organisations holding many ASNs, but usually they will consolidate all IP allocations associated with these ASNs under one single ASN and return the other ASNs to the regional registries. They will however hold on to the IP allocations, as IP space is very valuable.

This brings me to a side note: IP addresses being sold by an ISP. Many ISPs charge money for assigning IP addresses to a customer company. This is acceptable as long as it is a one-time administrative cost, but IP addresses are not the property of an ISP. IP addresses are granted to an ISP for connectivity purposes. Any ISP that requires a company to pay a recurring fee for handing out IP addresses is in violation of Registry policy. IP addresses are not property, and the only money you can be charged by an ISP for an IP allocation from the IP addresses they have been assigned is money that covers the administrative costs they have. An organisation may not, under any circumstances, make a profit on IP assignments. Many of them still do, because no-one feels like taking them to court over it. Just one of the many things that are wrong on the internet... But I thought I'd mention it. Smack the next sales person you encounter asking money for IP addresses with this little tidbit, will you?

So after reading all this, you might want a few interesting links.
Over here http://www.iana.org/assignments/ipv4-address-space you can find a listing of all IP allocations by IANA. The notation used is CIDR, so 212/8 is a massive block (class A in the old style) covering all IP addresses from 212.0.0.0 to 212.255.255.255.
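To make the CIDR notation concrete, here is a minimal sketch using Python's standard ipaddress module. It expands the 212/8 example from above. Writing the shorthand out as 212.0.0.0/8 is the usual long form, and the variable names are only for illustration.

```python
import ipaddress

# "212/8" is shorthand for 212.0.0.0/8: the first 8 bits are fixed,
# the remaining 24 bits are free to vary.
block = ipaddress.ip_network("212.0.0.0/8")

first = block.network_address    # lowest address in the block
last = block.broadcast_address   # highest address in the block
size = block.num_addresses       # 2 ** (32 - 8)

print(first, "to", last, "holds", size, "addresses")
```

The number after the slash is the count of fixed leading bits, so every step down (a /9, a /10, ...) halves the size of the block.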

Here http://www.arin.net/statistics/2001stats.html you can find the total number of IP requests processed and AS numbers assigned by ARIN in 2001.

transport, and why it matters

As we all know, IP is the transport mechanism of the Internet. Network people always talk about 'layers'. In this they refer to the OSI model. OSI stands for Open Systems Interconnection. It is a model that was developed for universal network-oriented communication in the late 1970s and early 1980s. The Internet was built around a different model. Actually, there is no formal model that the Internet was designed around; that idea came later. But the model is called the DoD model - for Department of Defense. As we all know, the Internet evolved from the DARPA network experiments. DARPA stands for Defense Advanced Research Projects Agency. This is where IP and higher level protocols such as TCP, UDP and ICMP were developed. The war between the DoD model and OSI model was a long one, but OSI failed miserably in this. OSI has a 7 layer model detailing everything from the physical transmission media to the abstract application interfaces. The DoD model for networking won this battle simply because Internet communications became the de facto standard. The OSI model is still referred to, but no-one in networking uses its concepts above layer 4 (where TCP and friends reside). Basically, layers 1 and 2 refer to the physical media and the ways in which signals are exchanged over these media (Ethernet is a layer 1 and 2 standard). IP is a layer 3 protocol (the 'network' layer). TCP is a layer 4 protocol (the 'transport' layer). The higher layers of the OSI model are called 'session', 'presentation' and 'application'.

This whole model works very well. The basics are pumped into each networking newbie and are referred to at all levels. For LAN technologies running our beloved TCP/IP protocols, that is. There are a lot of other networking technologies out there, and once you start mixing datacom and telecom technologies the entire model falls to pieces.

So IP is a network layer protocol, not a transport layer one. But that's just semantics. IP is what's used to transport data across a network. That's why I (and many others) call it a transport mechanism. At layer 3 we find the routers, pieces of network equipment that forward packets of data based on their IP addresses (which are logical, not physical).

You might be wondering now how IP packets go from one place to another. This is a very good question. On a local network you don't need IP, you can communicate with other machines just fine using layer 2. Basically you just send a signal onto the wire and, say, Ethernet will take care of the rest. But if you want to go outside you'll want your packets to be routed. Think of routing as sending a packet into a different domain. One that has no awareness of your physical networking media. You don't really think that the machines at http://www.google.com know if you're on a dial-up link, DSL or a LAN connection, do you?

If you want to 'go outside', your packets will reach a router. A router has multiple interfaces and a routing table. It receives your packet and looks at where that packet wants to go. It then looks into its routing table to decide which interface it's going to forward that packet to. Most of the time your packet will have a destination address that the router doesn't know about. So it'll send that packet to another router that is higher up in the hierarchy. If this packet has to go onto the Internet, it will be sent upwards and upwards in the network, until it reaches the border of the network. Here it is passed to a router that is connected to the Internet.
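The lookup described above can be sketched in a few lines of Python. This is only an illustration: the table entries and interface names are made up, and the rule that the most specific matching route wins (longest prefix match) is how routers commonly resolve overlapping entries, not something the text above spells out.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, outgoing interface).
# 0.0.0.0/0 is the default route: "I don't know, send it upwards".
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth0"),  # local LAN
    (ipaddress.ip_network("192.168.0.0/16"), "eth1"),  # rest of the site
    (ipaddress.ip_network("0.0.0.0/0"), "uplink"),     # everything else
]

def lookup(destination: str) -> str:
    """Return the interface of the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in ROUTES if addr in net]
    # The longest prefix (largest prefixlen) is the most specific match.
    _, iface = max(matches, key=lambda m: m[0].prefixlen)
    return iface

for dest in ("192.168.1.20", "192.168.7.9", "216.239.33.100"):
    print(dest, "->", lookup(dest))
```

Because the default route matches every address, the lookup always finds something, which is exactly the "send it to a router higher up" behaviour.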

it's not getting any better

Now we reach the interesting part. You see, an ISP has to connect to the Internet. It has to have some way of pushing IP packets to other networks. So we're back at the beginning and the reason I'm writing this. Pushing packets inside your own network is tricky enough. How to do this when you want them to go outside? Well, there are several ways. Most ISPs have routers at Internet Exchanges (IXs), also known as Network Access Points (NAPs). These Exchange points usually have a common medium (usually an array of interconnected switches) that each member is connected to. Any two ISPs connected to such an IX can agree to exchange traffic with each other. So if there are 50 ISPs on this IX they can exchange traffic with each other, if they all decide that they want to. When two parties agree to exchange traffic they become 'peers'. So now your ISP can exchange traffic with all other ISPs on the switch that they have agreements with. But this still limits the networks they can reach to those they are peering with. An ISP in France isn't likely to have a peering agreement with an ISP in Korea.
For a list of Internet Exchange points, try http://www.colosource.com/ix.asp

So now we come to the Carriers. Carriers are networks that are vast and intercontinental. They exchange traffic with other Carriers. Examples of Carriers are KPN/QWest, MCI/Worldcom, Level3, etc. An ISP will usually sign a contract with one or more Carriers. The Carriers will then agree to accept traffic for any network, and to send traffic from any network to the ISP's network if its destination is that ISP's network.

So why were these ASNs necessary again? Well, these networks all build up their routing tables based upon the ASNs. At this level, the routers are not just talking IP, they are also talking another protocol, called BGP (Border Gateway Protocol). Using BGP these routers tell each other their ASNs and the ASNs they can reach.

The routers make decisions on where they send their packets based on AS policy. A Carrier will accept all traffic from an ISP router, but this costs money and if the router can send the packets directly to the network it wants to reach via the IX switch, it makes sense to use the switch because it is faster and cheaper. But if an ISP doesn't have a peering agreement with the network it wants to send data to then the router on the other side of the switch will not accept the traffic, so it has to be sent via the Carrier.

So at the heart of the Internet, routers don't just look at IP addresses, they look at ASNs and at what the best path is to send traffic along. Say a border router wants to send data to IP x.x.x.x. It will look up this address in its routing table - a table that is the result of BGP policy.
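As a rough sketch of how such a table comes about: a border router hears several announcements for the same IP range and keeps one of them. The code below keeps the one with the shortest AS path. That is a deliberate simplification, since real BGP policy weighs local preferences (such as cheap peering over paid transit) before path length. All prefixes, ASNs and names here are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Announcement:
    prefix: str      # IP range being announced, in CIDR notation
    as_path: tuple   # ASNs to cross to reach it, nearest first
    neighbour: str   # who told us (illustrative label only)

def build_table(announcements):
    """Keep, per prefix, the announcement with the shortest AS path.

    Ties keep whichever announcement arrived first.
    """
    table = {}
    for ann in announcements:
        best = table.get(ann.prefix)
        if best is None or len(ann.as_path) < len(best.as_path):
            table[ann.prefix] = ann
    return table

# Hypothetical announcements, using documentation prefixes and ASNs.
received = [
    Announcement("192.0.2.0/24", (64510, 64501), "carrier"),
    Announcement("192.0.2.0/24", (64501,), "exchange peer"),
    Announcement("203.0.113.0/24", (64510, 64502), "carrier"),
]

table = build_table(received)
for prefix, ann in table.items():
    print(prefix, "via", ann.neighbour, "AS path", ann.as_path)
```

Here the first range ends up routed via the exchange peer (one AS away), while the second is only reachable through the carrier.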

building pictures

All this talk about AS numbers and topology views will probably have you a bit confused. I promised you I would go easy on the technical jargon, and I have. (well, OK, I haven't really). But the concepts we're dealing with are a bit on the abstract side, so a certain amount of jargon is necessary. "_dose!" I hear you say, "you are boring the crap out of me!" And I hear you, and so I present a pretty picture to liven things up a bit.

Consider this example of an AS topology,

      AS-1 -----------                        - AS-21
       |              \---- (Carrier)--------/    |
       |                  /   AS-17  \            |
     [Exchange]          /            \          [Exchange]
     [ Switch ]         /              \         [ Switch ]
      |     |          /                \         |
      |     |         /                  \        |
    AS-2   AS-3 -----/                    ----- AS-42 ---- AS-33

AS-1 can reach the rest of the ASs via the following paths:
Over the exchange switch (via peering),

AS-1 - AS-2
AS-1 - AS-3

and over Transit (via the carrier),

AS-1 - AS-17
AS-1 - AS-17 - AS-3
AS-1 - AS-17 - AS-21
AS-1 - AS-17 - AS-42
AS-1 - AS-17 - AS-42 - AS-33

Apart from the fact that I will never be a graphic designer, the interesting details we can glean from the AS paths and the map are,

There is no path AS-1 - AS-3 - AS-17. AS-3 peers with AS-1, but does not allow AS-1 to use its network for transit.
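Neither is there a path AS-1 - AS-17 - AS-21 - AS-42. AS-21 peers with AS-42 over its exchange switch, but does not pass transit traffic on to its peer.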
AS-1 has two possible paths to AS-3, one is directly via peering and the other is via transit (AS-1 - AS-3 and AS-1 - AS-17 - AS-3, respectively).
AS-33 has no peering arrangements, and AS-42 provides transit for it.
AS-2 can only reach AS-1 and AS-3, but not the world outside the exchange switch.

The terminology can be a bit misleading sometimes (especially with the more esoteric constructions), but all routers that exchange routing information via BGP are called neighbours. The difference between 'peer' and 'transit provider' is that peers are equals and usually only exchange traffic that is bound for each other's networks. A transit provider charges money for accepting your traffic and forwarding it to other networks.
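The path list above can be reproduced mechanically. The sketch below encodes the diagram as peering and transit relationships (my reading of the picture, not an official format) and applies one rule: a path may climb to transit providers, cross at most one peering link, and after that only descend to customers. The function and variable names are hypothetical.

```python
# Relationships read off the diagram above (an interpretation).
PROVIDER_OF = {  # customer AS -> set of its transit providers
    1: {17}, 3: {17}, 21: {17}, 42: {17}, 33: {42},
}
PEERS = {frozenset(p) for p in [(1, 2), (1, 3), (2, 3), (21, 42)]}
ALL_AS = {1, 2, 3, 17, 21, 33, 42}

def link_type(a, b):
    """Classify the hop a -> b as 'up', 'down', 'peer', or None if unlinked."""
    if b in PROVIDER_OF.get(a, ()):
        return "up"    # a buys transit from b
    if a in PROVIDER_OF.get(b, ()):
        return "down"  # b buys transit from a
    if frozenset((a, b)) in PEERS:
        return "peer"
    return None

def valid_paths(start):
    """List loop-free AS paths from start that respect the policy rule."""
    results = []

    def walk(path, descending):
        for nxt in sorted(ALL_AS - set(path)):
            kind = link_type(path[-1], nxt)
            if kind is None:
                continue
            if descending and kind != "down":
                continue  # peers and customers get no onward transit
            new_path = path + [nxt]
            results.append(new_path)
            walk(new_path, descending or kind in ("peer", "down"))

    walk([start], False)
    return results

for path in valid_paths(1):
    print(" - ".join(f"AS-{n}" for n in path))
```

Run for AS-1 this gives the same seven paths as listed above, and for AS-2 only the two peering paths, which is why AS-2 cannot see the world beyond its exchange switch.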

Now, if you'd prefer to view some actual graphics, you can find an AS map of the internet at http://www.caida.org/analysis/topology/as_core_network/AS_Network.xml. You might want to read the description to make sense of it, though.

complications

The internet edges are tricky places to work. And often the work here goes wrong. Line failures, misconfigurations, updates going wrong and everyone's favorite: one small mistake triggering a landslide of failures affecting many networks.

Some say the Internet was designed to withstand a nuclear attack. This isn't true. The original ARPANet was designed to withstand the complete failure of one or two points, but the Internet as we know it today is a completely different animal. Many of the original protocols and concepts are still with us (much to the chagrin of security engineers), but the network has evolved into something different. Often I hear people saying "Well, if the US-EU links go down, we'll just route traffic via the Mid-East to Asia to the US West Coast, right?" Wrong. Take a look at this picture, http://www.telegeography.com/pubs/maps/internet/index.html - the wallpaper version at the bottom is a higher resolution. This picture shows us the intercontinental capacity of the IP datacom networks. Do you really think we could compensate for the US-EU loss by "routing that traffic somewhere else"? That said, it's not very likely that all the fibres across the Atlantic would fail at the same time.

If it were to happen, though, it would disrupt business communications between the EU and the rest of the world. It would disrupt private communications as well, but these are of lesser impact. Small to medium businesses would hardly notice, as most email and websites relevant to their operations are inside the EU region. There is no particular element of regional operation that requires the Americas to be reachable via the network. It might even improve productivity in many workplaces, as Hotmail, Yahoo and assorted web comics would also be unavailable :)

trusting systems

BGP neighbours basically tell each other which networks (ASs) they can reach and which IP ranges are associated with these networks. This way, each router can build its routing table. However, there are generally few checks enforced on the routes a router will accept from its neighbours. Each neighbour has to be explicitly configured and after that is usually considered trusted.

In this way it is trivial for one router to announce IP ranges that don't belong to it, and other routers will happily propagate this information over the internet. Let's say that ISP-A accidentally announces an IP range belonging to ISP-B. ISP-A will then receive an amount of traffic that is destined for ISP-B. These things happen on occasion, and are generally fixed very fast. Operator mailing lists will carry this information and if ISP-A doesn't fix its announcement, other ISPs will start to ignore its announcements, effectively isolating ISP-A.

These announcements are however quickly noticed because an ISP will actively monitor the announcements of its address space. It is a bit different with IP address space that has not been assigned yet. This address space is called murky address space because it does on occasion appear in the Internet's routing tables. This is usually attributed to advanced spammers with a lot of networking skills wishing to hide their tracks. They break into an ISP's routers and adjust its BGP configuration to also announce unassigned address space (which they use to send SPAM), after which they remove the announcements and effectively disappear. These, and other advanced attacks of SPAM, scams, etc are attributed to the dot-com fallout, which left a lot of very skilled people with no jobs.
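To show what a check on incoming announcements could look like, here is a minimal sketch that compares the announcing AS against a list of who holds which range. The registry contents, ASNs and function name are all hypothetical. The point is only that both the ISP-A/ISP-B mix-up and an announcement of unassigned space fail such a test.

```python
import ipaddress

# Hypothetical registry data: which AS is supposed to originate which block.
REGISTERED = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,     # stands in for ISP-A
    ipaddress.ip_network("198.51.100.0/24"): 64501,  # stands in for ISP-B
}

def origin_ok(prefix: str, origin_as: int) -> bool:
    """True if origin_as is registered for a block that covers prefix."""
    net = ipaddress.ip_network(prefix)
    return any(
        net.subnet_of(block) and holder == origin_as
        for block, holder in REGISTERED.items()
    )

print(origin_ok("192.0.2.0/24", 64500))     # ISP-A announcing its own range
print(origin_ok("198.51.100.0/24", 64500))  # ISP-A announcing ISP-B's range
print(origin_ok("203.0.113.0/24", 64500))   # range nobody has registered
```

A router applying a filter like this would drop the second and third announcements instead of passing them on to its neighbours.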

internet size and growth

Many people try to measure the size of the Internet by the number of people connected to it. This is, of course, hopeless. But from a network engineering point of view there are other ways of measuring both the size and the growth of the net. The easiest to look up are the number of ASs, the number of routes in the Internet routing tables, the IP space allocated by regional registries and the amount of traffic flowing through the main exchange points.

Currently, there are approximately 25,000 ASN assignments in use. The total number of unique networks will be significantly lower after the enormous number of corporate mergers over the last few years. When two or more companies merge (or one is acquired by the other), the ASNs and IP allocations of both will still be held by the resulting company. It is considered proper behaviour to merge the ASNs, announce all IP allocations under one ASN and return superfluous IP ranges to the regional registry for re-allocation. However, this is a lot of work. And as obtaining IP addresses in the first place requires a lot of paperwork, most companies prefer to hold on to what they have.

A full Internet routing table holds approximately 120,000 unique routes today (mid 2002), up from 20,000 in 1994. A graph showing this growth can be found at http://www.mcvax.org/~jhma/routing/bgp-hist.html. This impacts not just the size of the Internet, but also the hardware necessary to process all these routes. Keep in mind that these routing tables are dynamic and change from minute to minute. Also, each packet traversing such a router has to be matched against the routing table (with its 120,000 possibilities) to determine its destination. Not just the capacity and speed of the hardware have improved to meet this challenge; the software performing the actual routing has become highly specialised to handle this. Additionally an Internet router usually has multiple high-speed interfaces (each usually between 100 Mbit/s and 1 Gbit/s).
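For a feel of what that growth means per year, here is a small back-of-the-envelope calculation. It assumes smooth compound growth between the two figures quoted above, which the real curve certainly did not follow exactly.

```python
import math

routes_1994 = 20_000
routes_2002 = 120_000
years = 2002 - 1994

# Compound annual growth: the yearly factor g satisfies
# routes_1994 * g ** years == routes_2002.
growth_factor = (routes_2002 / routes_1994) ** (1 / years)
annual_rate = growth_factor - 1

# Time for the table to double at that constant rate.
doubling_years = math.log(2) / math.log(growth_factor)

print(f"about {annual_rate:.0%} per year, doubling every {doubling_years:.1f} years")
```

Under that assumption the table grows by roughly a quarter every year, so router hardware has to keep up with a table that doubles about every three years.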

So how much traffic is there on the Internet? Also an impossible question to answer. A good starting point is however to view the traffic statistics of some major Internet Exchanges. One of the major European IXs is the AMS-IX (Amsterdam Internet Exchange), and its statistics can be viewed at http://www.ams-ix.net/hugegraph.html. Slightly over a year ago the aggregate traffic via peering alone here was 2 Gbit/s, now it is almost 6 Gbit/s. (As viewed in the monthly graph; in the daily graph we can see spikes over 8 Gbit/s.) In this area, MFN - Metromedia Fibre Networks / Above.net have a large facility servicing IP traffic, with 9 public circuits carrying traffic to and from this location internationally. The traffic graphs for each of these circuits can be found here, http://www.mfn.com/network/ip_networkstatus.shtm#ams. The circuits are SDH STM-4s and STM-16s, 622.08 Mbit/s and 2.488 Gbit/s respectively. A high level overview of its network can be found here http://www.mfn.com/network/ip_networkmaps.shtm. And MFN/Above.net is far from being the only Carrier in the game.
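The two circuit speeds are not arbitrary: an STM-n circuit carries n times the base SDH rate of 155.52 Mbit/s (STM-1). The sketch below does that arithmetic. The comparison with the 6 Gbit/s peering figure is purely illustrative and uses raw line rates, ignoring framing overhead and the fact that nobody runs links at 100%.

```python
import math

STM1_MBPS = 155.52  # line rate of the base SDH signal, STM-1, in Mbit/s

def stm_rate_mbps(n: int) -> float:
    """Line rate of an STM-n circuit: n multiplexed STM-1 signals."""
    return n * STM1_MBPS

for n in (4, 16):
    print(f"STM-{n}: {stm_rate_mbps(n):.2f} Mbit/s")

# Hypothetical comparison: how many STM-16 circuits would it take to
# carry the roughly 6 Gbit/s of peering traffic quoted above?
peering_mbps = 6000
circuits_needed = math.ceil(peering_mbps / stm_rate_mbps(16))
print(circuits_needed, "STM-16 circuits")
```

So the exchange's peering traffic alone would already fill more than two of the fattest circuits mentioned here.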

future

The dot-com bubble may have burst, but the Internet hasn't gone down with it. In fact the net is still growing every day. Some providers go down, but there are plenty of others waiting to sell Internet access just around the corner. Of course this all costs money, but people want access. Much like the advancement of the telecommunications industry post World War II, the Internet is bound to keep on growing and probably take on a different, and more pervasive form in the future.

Already there are plans on the table for interplanetary Internet communications. We have the Consultative Committee for Space Data Systems, http://www.ccsds.org/, to name one. And of course the Interplanetary Internet Project http://www.ipnsig.org/home.htm, in case you were afraid of ending up on Mars without pornography.

I bet you were expecting a description of IP version 6 here, right? Sorry, I'll write about that once it is more widely deployed.

the end (you made it)

I hope that you have learned something through reading this essay. That is, after all, the point of my writing it in the first place. Thanks go to fatboyjoe for editorial feedback (feel free to blame him for the parts you don't like :), grugq for being grugq, mammon_ for Bastard and fravia+ for his past and present works.

I can usually be reached via various message boards, try ~S~ Seeker's messageboard. I usually drop in to read some posts there. Otherwise you could try dropping me a line at dose at remove-this at linux dot nl dot com. If that doesn't work - ask around.

cheers,


     _dose
     05/2002
