This is absolutely not meant to be a substitute for a complete understanding of such a deep subject. Instead, the goal is to highlight what elements of network design are crucial from the perspective of Hadoop deployment and performance.

The following sections assume you're already familiar with basic networking concepts such as the OSI model, Ethernet standards such as 1- (1GbE) and 10-gigabit (10GbE), and the associated media types. Cursory knowledge of advanced topics such as routing theory and at least one protocol such as IS-IS, OSPF, or BGP is helpful in getting the most out of "Spine fabric" on page 72. In the interest of simplicity, we don't cover bonded hosts or switch redundancy where it's obviously desirable. This isn't because it's not important, but because how you accomplish that tends to get into switch-specific features and vendor-supported options.

Chapter 4: Planning a Hadoop Cluster

Network Usage in Hadoop: A Review

Hadoop was developed to exist and thrive in real-world network topologies. It doesn't require any specialized hardware, nor does it employ exotic network protocols. It will run equally well in flat Layer 2 networks and in routed Layer 3 environments. While it does attempt to minimize the movement of data around the network when running MapReduce jobs, there are times when both HDFS and MapReduce generate considerable traffic. Rack topology information is used to make reasonable decisions about data block placement and to assist in task scheduling, but it helps to understand the traffic profiles exhibited by the software when planning your cluster network.

HDFS

In Chapter 2, we covered the nuts and bolts of how HDFS works and why. Taking a step back and looking at the system from the perspective of the network, there are three primary forms of traffic: cluster housekeeping traffic such as datanode block reports and heartbeats to the namenode, client metadata operations with the namenode, and block data transfer. Basic heartbeats and administrative commands are infrequent and only transfer small amounts of data in remote procedure calls. Only in extremely large cluster deployments, on the order of thousands of machines, does this traffic even become noticeable.

Most administrators will instead focus on dealing with the rate of data being read from, or written to, HDFS by client applications. Remember, when clients that execute on a datanode where the block data is stored perform read operations, the data is read from the local device, and when writing data, they write the first replica to the local device. This eliminates a significant amount of network data transfer. Clients that do not run on a datanode or that read more than a single block of data will cause data to be transferred across the network. Of course, with a traditional NAS device, for instance, all data moves across the network, so anything HDFS can do to mitigate this is already an improvement, but the traffic that remains is nothing to scoff at. In fact, writing data from a noncollocated client causes the data to be passed over the network three times, two of which pass over the core switch in a traditional tree network topology. This replication traffic moves in an East/West pattern rather than the more common client/server-oriented North/South. Significant East/West traffic is one of the ways Hadoop is different from many other traditional systems.

North/South and East/West Distinctions

If you're unfamiliar with the use of North/South and East/West in the context of network traffic, do not be afraid. This simply refers to the primary directional flow of traffic between two hosts on a network. Picture the network diagram of a typical tree network (see Figure 4-3). Traffic from clients typically flows from the top (or North) of the diagram to the bottom (South) and back (or vice versa; it doesn't really matter). A good example of this is hundreds or thousands of users of a web application; requests initiate from outside the network and flow in through the core, to the web application server, and back out the way they came. The web application servers, for instance, never communicate with one another (horizontally or East/West). Conversely, both HDFS and MapReduce exhibit strong East/West, or full node-to-node, communication patterns. Some network topologies are better suited to North/South or East/West traffic patterns, as we'll see in a bit.

Beyond normal client interaction with HDFS, failures can also generate quite a bit of traffic. Much simpler to visualize, consider what happens when a datanode that contains 24 TB of block data fails. The resultant replication traffic matches the amount of data contained on the datanode when it failed.
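To put rough numbers on the two HDFS traffic sources described above, here is a minimal back-of-the-envelope sketch in Python. It is not part of Hadoop: the function names, the decimal definition of a terabyte, and the aggregate bandwidth figure are all assumptions made for illustration. The first function counts how many times a written block crosses the network, and the second gives a lower bound on how long re-replication takes after a datanode fails.

```python
TB = 10**12  # decimal terabyte in bytes (assumption; the text does not define TB)


def write_transfers(replication: int = 3, client_on_datanode: bool = False) -> int:
    """Count how many times a written block is sent over the network.

    A client running on a datanode writes the first replica to its local
    device, so only the remaining replicas travel over the network. A
    noncollocated client sends every replica over the network.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return replication - 1 if client_on_datanode else replication


def rereplication_seconds(failed_bytes: int, aggregate_gbit_per_s: float) -> float:
    """Lower bound on the time needed to re-replicate a failed datanode.

    The traffic volume equals the data the node held when it failed.
    aggregate_gbit_per_s is a hypothetical cluster-wide rate available
    for re-replication; real clusters may deliver less than this.
    """
    if aggregate_gbit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    bytes_per_second = aggregate_gbit_per_s * 1e9 / 8
    return failed_bytes / bytes_per_second


if __name__ == "__main__":
    print(write_transfers())                         # noncollocated client
    print(write_transfers(client_on_datanode=True))  # client on a datanode
    print(rereplication_seconds(24 * TB, 10.0) / 3600, "hours")
```

With the assumed 10 Gbit/s of aggregate bandwidth, re-replicating 24 TB takes at least 19,200 seconds, a little over five hours. The real figure depends on how many nodes share the work and on how much bandwidth is left over after normal job traffic.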