Fun with PowerShell: Part 1 – Loading Demo Data
I know I am relatively late to the game when it comes to PowerShell, but I’ve found myself using it more and more lately in places where I would traditionally write a bit of C# code. Yesterday as I...
An End-to-End HDInsight Demo
I’ve got one last post in me before vacation starts… By now, the big data buzzword fatigue has set in and you’ve probably completed 1,000 demos that highlight the different parts of the Hadoop...
Streaming #Pig
As a C# developer there are a number of opportunities available for writing code that is either used by or interacts with a Hadoop/HDInsight cluster. A number of these have been well publicized and...
3 Little Piggy’s: Advanced #Pig Join Scenarios
One of the most common operations in any Pig job is the join. A join, much like the ones you work with in SQL Server, brings together two sets of data into one. These joins can happen in...
Hello…Azure Data Factory!
Overview: The boundaries between on-premises and cloud-born data continue to blur, with more and more organizations moving to hybrid data landscapes. The blurring of these lines introduces a number of...
Something’s Brewing with Azure Data Factory Part 2
In my last post (HERE), I started hacking my way through the new Azure Data Factory service to automate my beer recommendation demo. The first post was all about setting up the necessary scaffolding...
Something’s Brewing with Azure Data Factory – Part 3
In the first two parts of this blog series (HERE and HERE), we used Azure Data Factory to load beer review data from an Azure SQL Database to an Azure Blob Storage account. We then processed that data...
Ooooh I’m Telling : Doing Swear Word Analysis with Storm on HDInsight
As promised, this is the first of three (maybe more) posts that will present an end-to-end example to showcase the distributed streaming capabilities of the Apache Storm project. This first post will...
Geospatial Queries Using Hive
During one recent engagement, I was helping my customer align ETL activities that were originally developed using SQL Server and T-SQL with the capabilities that were available using Hadoop and Hive....
Using #PolyBase in #SQLServer2016
It’s been a few weeks since the numerous Build and Ignite announcements ushered in the latest and greatest, SQL Server 2016. After having some time to soak it up (aka I’ve been too busy to blog) we...