Channel: Bluewater SQL » Hadoop
Browsing all 48 articles

Installing HDInsight

It’s been a while since I’ve had the opportunity to blog, so when I decided to install HDInsight on a VM, I figured what better opportunity to get back into the swing of it. The Jumping Off Point To get...

Installing Mahout for HDInsight on Windows Server

I am passionate when it comes to analytics, data mining and machine learning and I think most organizations do too little when it comes to this arena. That’s why one of my favorite parts of the Hadoop...

HIVE on HDInsight: First Glance

Hive Introduction Within the Hadoop ecosystem, you can use HDFS to load and store data and MapReduce to do both simple and hardcore processing. One of the missing pieces to the puzzle that is familiar...

Preparing Data for Hadoop

In my next couple of blog entries, I will be focusing on Pig and then MapReduce. Before that, however, I need to prepare a dataset and get it loaded into HDFS. The data that I will be working with is...

When Pigs Fly: An Apache Pig Introduction

In previous posts, we have looked at what it takes to get started with Hadoop on Windows using HDInsight. We also looked at Hive, which is the data warehousing framework built on top of Hadoop. In...

Shakin’ Bacon: Using Pig To Process Data

In my last post (see HERE), I introduced the Apache Pig project and showed you the equivalent of the “Hello World” demo in Pig. In this post, we are going to use the GSOD (Global Summary of the Day)...

MMM More Bacon – Pig User-Defined Functions (UDFs)

Okay…okay…I know…the pig jokes are lame and getting old by now…maybe a picture of a kitten dressed like a Pig will cheer you up. Luckily this is the last of my introductory Pig posts before moving on...

Map/Reduce – A Brief Introduction

Somewhere between teaching a BI Bootcamp class and wrestling my troop of kids, I promised myself I would get a blog post in this week. Luckily, I’ve had a few code-heavy posts, so we will dial it back...

MapReduce – First Glance

In my last post, we took a helicopter tour of the MapReduce framework and its many facets. I believe it’s important to have a functional understanding of MapReduce even if you never intend to work...

MapReduce Ninja Moves: Combiners, Shuffle & Doing A Sort

Who’s driving this car? At first glance it appears that, as a developer, you have very little, if any, control over how MapReduce behaves. In some regards this is an accurate assessment. You have no...

Hello My Name is Sqoop

In my previous posts, we have looked at different means and methods for loading and subsequently working with data in a Hadoop environment. Largely missing from the discussion to date, however, is how SQL...

Building a Mahout Recommendation Engine: Part 1 – Types of Recommenders

Recommendation engines have become a pervasive and daily part of our digitally connected lives. Whether you’re shopping on Amazon or reading new articles on your Yahoo! home page, the products and news...

#Mahout Recommendation Engines: Part 2 – Ride the Elephant

In Part 1 of this blog series we built a foundation by introducing the various techniques that can be used to generate recommendations for products or items to your users. In this post, we begin...

#Mahout Recommendation Engines: Part 3 – Moving Data

In the previous two posts of this series, we built a foundation for designing and building a recommendation engine. In the first post we built an understanding of what a recommendation engine looks...

Introduction to #Hive Collections

After a much-needed vacation in the sunny Florida Keys and some time away from the work and blogosphere world, it’s time to get back on the hamster wheel. Like most RDBMS systems, Hive supports a number...

Partitions & Buckets in #Hive

In my previous post, we discussed the map, array and struct data types and their implementation in Hive. Continuing on the Hive theme, this post will introduce partitioning and bucketing as a method for...

Indexes & Views in #Hive

In my last Hive post, we introduced partitions and bucketing, both of which allow you to horizontally slice data to make it more manageable and easier to query. Staying the course, in this post we will...

Oink: Improving #Pig Development

Over the last couple (ok more than a couple) of months, we’ve taken a meandering stroll through the different parts and pieces that form the foundation of the Hadoop ecosystem. We’ve covered Hive,...

Streaming #Pig

As a C# developer, there are a number of opportunities available for writing code that is either used by or interacts with a Hadoop/HDInsight cluster. A number of these have been well publicized and...

3 Little Piggy’s: Advanced #Pig Join Scenarios

One of the most common operations in any Pig job is the join. A join, much like the ones you work with in SQL Server, brings together two sets of data into one. These joins can happen in...
