Category Archives: Big Data

Facebook – 2 petabytes in a rack

Each disk in the cold storage gear can hold 4 terabytes of data, and each 2U system contains two levels of 15 disks. In other words, each unit can handle 120 terabytes. A rack can hold 16 of these storage systems, allowing for roughly 2 petabytes of cold storage per rack.
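The arithmetic above is easy to sanity-check. A quick sketch in Python, using only the figures quoted from the article:

```python
# All figures come from the article: 4 TB disks, 2 levels of 15 disks
# per 2U system, 16 systems per rack.
TB_PER_DISK = 4
DISKS_PER_LEVEL = 15
LEVELS_PER_UNIT = 2
UNITS_PER_RACK = 16

tb_per_unit = TB_PER_DISK * DISKS_PER_LEVEL * LEVELS_PER_UNIT  # 120 TB per 2U system
tb_per_rack = tb_per_unit * UNITS_PER_RACK                     # 1920 TB, i.e. ~2 PB

print(tb_per_unit, tb_per_rack)  # → 120 1920
```

So "2 petabytes" is a round-up of 1920 TB, which is close enough for a headline.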

Source: http://www.datacenterknowledge.com/archives/2013/10/16/first-look-facebooks-oregon-cold-storage-facility/

You can read more about how to build these servers at

Hacking Conventional Computing Infrastructure » Open Compute 

www.opencompute.org/

By releasing Open Compute Project technologies as open hardware, our goal is to develop servers and data centers following the model traditionally associated …

Hadoop, MapReduce videos

Some nice videos that I’ve found on the famous YouTube…
just posting them here so I can watch them later this week 🙂

Sandy Ryza, of Cloudera, gives you a quick run-down of the basics of MapReduce: a programming abstraction that allows parallel processing of massive data sets without worrying about distributed systems or fault tolerance.

He goes over how it works, some of the applications it’s best suited for, and how it integrates with Hadoop and Java.
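To make the abstraction concrete before watching the videos, here is a minimal sketch of the map → shuffle → reduce flow as plain Python, using the classic word-count example (this simulates in one process what Hadoop does across a cluster; the function names are mine, not Hadoop's):

```python
from collections import defaultdict
from itertools import chain

# Map phase: for each input line, emit (word, 1) pairs.
def map_phase(line):
    return [(word, 1) for word in line.lower().split()]

# Shuffle phase: group all emitted values by key, as the framework
# does between the map and reduce phases.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: combine the values for one key (here, sum the counts).
def reduce_phase(key, values):
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

print(counts["the"])  # → 2
```

In real Hadoop the same three roles are filled by a Mapper class, the framework's shuffle/sort, and a Reducer class, with the data spread over many machines.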

Hadoop, Pivotal HD and HAWQ, some stolen paragraphs

HAWQ is a native, mature and fast SQL Query Engine for Hadoop.

HAWQ lets you apply existing SQL skills to data in Hadoop, with benefits including:

  • Parallel Query Optimizer
  • Dynamic Pipelining
  • Pivotal Extension Frameworks
  • Advanced Analytics Functions

Read the full article at
http://www.gopivotal.com/pivotal-products/data/pivotal-hd#4

More stolen paragraphs (paragraphs, images and diagrams)!!

Continue reading Hadoop, Pivotal HD and HAWQ, some stolen paragraphs