<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>HBase Explorer</title>
	<atom:link href="http://hbaseexplorer.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://hbaseexplorer.wordpress.com</link>
	<description>Exploring BigData processing</description>
	<lastBuildDate>Fri, 30 Dec 2011 21:01:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='hbaseexplorer.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>HBase Explorer</title>
		<link>http://hbaseexplorer.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://hbaseexplorer.wordpress.com/osd.xml" title="HBase Explorer" />
	<atom:link rel='hub' href='http://hbaseexplorer.wordpress.com/?pushpress=hub'/>
		<item>
		<title>From ETL to Realtime Map-Reduce</title>
		<link>http://hbaseexplorer.wordpress.com/2011/12/30/from-etl-to-realtime-map-reduce/</link>
		<comments>http://hbaseexplorer.wordpress.com/2011/12/30/from-etl-to-realtime-map-reduce/#comments</comments>
		<pubDate>Fri, 30 Dec 2011 21:01:43 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Hbase]]></category>

		<guid isPermaLink="false">http://hbaseexplorer.wordpress.com/?p=132</guid>
		<description><![CDATA[Like many others, you may have come to realize that running M/R jobs is not really an ad-hoc adventure. I am not sure how that illusion came up; after all, the way we process data has changed little, even though there is much more data now. I want to position some hyped terms [...]]]></description>
			<content:encoded><![CDATA[<p>Like many others, you may have come to realize that running M/R jobs is not really an ad-hoc adventure. I am not sure how that illusion came up; after all, the way we process data has changed little, even though there is much more data now. I want to place some hyped terms into the big picture here, and follow up with tips on real-time data processing with HBase in later posts.</p>
<p>Mostly you will do these things with your big data:</p>
<ol>
<li>keep it for reference (perhaps your bills),</li>
<li>collect it first, then dive into it for analysis (&#8220;Where do my customers actually come from?&#8221;, aka <a href="http://de.wikipedia.org/wiki/Online_Analytical_Processing">OLAP</a>),</li>
<li>or apply some algorithm to it and derive values for business decisions, possibly in real time (&#8220;customers that bought A also bought B&#8230;&#8221; or &#8220;Current Top Tweets by Region&#8221;; call it <a href="http://de.wikipedia.org/wiki/OLTP">OLTP</a> or <a href="http://en.wikipedia.org/wiki/Complex_event_processing">CEP</a>).</li>
</ol>
<p>If you already know your data and the questions you need answered, you can go straight to data processing. Otherwise you may first need to make the collected data readable. We used to call that <a href="http://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a>; these days M/R also does well here with the help of <a title="Hive" href="http://hive.apache.org/">Hive</a>, <a title="Pig" href="http://pig.apache.org">Pig</a> or <a title="Cascading" href="http://www.cascading.org/">Cascading</a>. Finally, there are <a href="http://www.oracle.com/technetwork/database/options/olap/index.html">many</a> <a href="http://www.vertica.com/2008/07/07/understanding-the-difference-between-column-stores-and-olap-data-cubes/">great</a> <a href="http://www.teradata.com/Teradata-Analytical-Ecosystem/">tools</a> out there to investigate such data; some place it into analysis cubes to make that work a bit handier for non-programming analysts.</p>
<p><a href="http://hbaseexplorer.files.wordpress.com/2011/12/etl_processing2.png"><img class="aligncenter size-full wp-image-148" title="etl_processing" src="http://hbaseexplorer.files.wordpress.com/2011/12/etl_processing2.png?w=600&#038;h=255" alt="" width="600" height="255" /></a></p>
<p>Once you know what you are looking for, you can decide <em>how current</em> your answers should be. You can stick with the ETL-to-cube approach if it is enough to look at these answers once a week or so. Or you automate and improve your ETL process further (here an <a href="http://www.vertica.com/the-analytics-platform/native-bi-etl-and-hadoop-mapreduce-integration/">M/R approach simplifies scaling</a>). Or you treat your incoming data as a stream of events and rebuild the ETL logic to operate in real time. As the &#8220;Load&#8221; part of ETL becomes obsolete, I replace it here with a &#8220;P&#8221; for &#8220;Processing&#8221;:</p>
<p><a href="http://hbaseexplorer.files.wordpress.com/2011/12/etp_processing.png"><img class="aligncenter size-full wp-image-146" title="etp_processing" src="http://hbaseexplorer.files.wordpress.com/2011/12/etp_processing.png?w=600&#038;h=238" alt="" width="600" height="238" /></a></p>
<p>There are a few challenges in doing that ETP work in real time:</p>
<ul>
<li>Turning aggregations into incremental aggregations (&#8220;select sum() over a week&#8221; may become &#8220;increment X of thisWeek&#8221;).</li>
<li>Keeping a calculation context over a longer period (&#8220;if X happened 5 hours before Y then&#8230;&#8221;).</li>
<li>Handling unique-value aggregations (&#8220;how many unique visitors do I have over a week&#8230;&#8221;).</li>
<li>You may need more CPU cycles and overall I/O, as you cannot benefit from the batch processing of classic ETL tools.</li>
<li>Synchronization, if your data arrives through different channels and you want to manipulate shared data (such as an index).</li>
<li>Your business analysts may still need some good old analysis; those tools want to be loaded, so you may have to keep some kind of ETL alive. In other words, you probably have to add code and computers; you cannot simply reuse what you have.</li>
</ul>
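<p>The first and third challenges above can be sketched in a few lines. This is only an illustration under assumptions: the class <code>WeeklyAggregates</code> and its methods are hypothetical names, and plain in-memory dictionaries stand in for the counter and set columns of a key-value store. Each incoming event increments a per-week counter instead of re-running a sum, and a per-week set absorbs repeated visitors.</p>
<pre>
```python
from collections import defaultdict
from datetime import date


class WeeklyAggregates:
    """Hypothetical in-memory stand-in for counter and set columns in a key-value store."""

    def __init__(self):
        self.sums = defaultdict(int)      # week key to running sum
        self.visitors = defaultdict(set)  # week key to distinct visitor ids

    @staticmethod
    def week_key(day):
        # ISO year and ISO week number form the row key
        iso = day.isocalendar()
        return "%d-W%02d" % (iso[0], iso[1])

    def on_event(self, day, visitor_id, amount):
        key = self.week_key(day)
        # Incremental aggregation: bump the counter instead of re-running sum()
        self.sums[key] += amount
        # Unique-value aggregation: a set ignores repeated visitors
        self.visitors[key].add(visitor_id)

    def weekly_sum(self, day):
        return self.sums[self.week_key(day)]

    def unique_visitors(self, day):
        return len(self.visitors[self.week_key(day)])


agg = WeeklyAggregates()
agg.on_event(date(2011, 12, 26), "alice", 10)
agg.on_event(date(2011, 12, 27), "bob", 5)
agg.on_event(date(2011, 12, 30), "alice", 7)
print(agg.weekly_sum(date(2011, 12, 30)), agg.unique_visitors(date(2011, 12, 30)))
```
</pre>
<p>In a real store the set of visitor ids can grow large, so an approximate counting structure may be the more practical choice; the exact set is used here only to keep the sketch short.</p>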
<p>The good news is that the new, highly scalable key-value stores help to implement this with rather simple patterns (which indeed often look quite similar to M/R algorithms, which may justify the term &#8220;Realtime&#8221;). My preferred toy is <a href="http://hbase.org">HBase</a>, but most of these patterns can also be implemented with <a href="http://cassandra.apache.org">Cassandra</a>, <a href="http://hypertable.org/">Hypertable</a>, <a href="http://redis.io/">Redis</a> or <a href="http://fallabs.com/tokyocabinet/">Tokyo Cabinet</a>.</p>
<p><em>Read about some simple patterns that seem to occur repeatedly in the ETP stage in the next post.</em></p>
<p style="text-align:left;">It is not complicated to build your own code to Extract your data, apply whatever Transformation you need and Process it, possibly even using data that is already in your data store. For HBase, check out Avro, Thrift or Protocol Buffers to conveniently talk to HBase using complex domain objects. To be scalable, you generally want to avoid complex synchronization, and some tools are out there to help you with that task. I guess <a title="wibidata" href="http://wibidata.com">WibiData</a> offers something more ready to use.</p>
<p style="text-align:left;">Going further, <a href="http://s4.io/">some</a> <a href="http://www.theregister.co.uk/2010/09/24/google_percolator/">ideas</a> are out there to chain several processing steps in a map-reduce manner, although they seem rather complex to use and to configure. Anyway, call it realtime Map-Reduce.</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2011/12/30/from-etl-to-realtime-map-reduce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>

		<media:content url="http://hbaseexplorer.files.wordpress.com/2011/12/etl_processing2.png" medium="image">
			<media:title type="html">etl_processing</media:title>
		</media:content>

		<media:content url="http://hbaseexplorer.files.wordpress.com/2011/12/etp_processing.png" medium="image">
			<media:title type="html">etp_processing</media:title>
		</media:content>
	</item>
		<item>
		<title>Next Munich OpenHUG on 25 November!</title>
		<link>http://hbaseexplorer.wordpress.com/2010/11/16/next-munic-openhug-25-november/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/11/16/next-munic-openhug-25-november/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 11:36:12 +0000</pubDate>
		<dc:creator>Al</dc:creator>
		
		<guid isPermaLink="false">http://hbaseexplorer.wordpress.com/?p=129</guid>
		<description><![CDATA[Skytec AG Keltenring 11 Oberhaching http://events.linkedin.com/Munich-OpenHUG-Meeting/pub/477768]]></description>
			<content:encoded><![CDATA[<p>Skytec AG<br />
Keltenring 11<br />
Oberhaching </p>
<p>http://events.linkedin.com/Munich-OpenHUG-Meeting/pub/477768</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/11/16/next-munic-openhug-25-november/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
		<item>
		<title>3rd Munich OpenHUG on May 6</title>
		<link>http://hbaseexplorer.wordpress.com/2010/04/27/3rd-munich-openhug-on-may-6/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/04/27/3rd-munich-openhug-on-may-6/#comments</comments>
		<pubDate>Tue, 27 Apr 2010 09:40:58 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Munich OpenHUG]]></category>

		<guid isPermaLink="false">http://hbaseexplorer.wordpress.com/?p=119</guid>
		<description><![CDATA[We are once again inviting you to discuss NoSQL and BigData matters. Stefan Seelmann will show us an example of a real-world integration of the two top-class (and now top-level) projects. You are welcome to bring a short presentation too. When: 6. May 2010, 18:00, open end. We may get something to eat &#38; drink from [...]]]></description>
			<content:encoded><![CDATA[<p>We are once again inviting you to discuss NoSQL and BigData matters. <a href="http://stefan-seelmann.de/index.php?/categories/1-Apache-Directory">Stefan Seelmann</a> will show us an example of a real-world integration of the two top-class (and now top-level) projects. You are welcome to bring a short presentation too.</p>
<p><strong>When</strong>: 6. May 2010, 18:00, open end. We may get something to eat &amp; drink from the pizza shop around the corner.</p>
<p><strong>Place</strong>: eCircle GmbH, 80686 München, Nymphenburger Str. 86.</p>
<p>cu!</p>
<iframe width="640" height="480" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.de/maps?q=ecircle AG münchen&amp;hl=de&amp;cd=1&amp;ei=sK_WS-zJIcWe_gb7jYX2DQ&amp;sig2=rKa_ip2_xxB57JhCOQXP5g&amp;sll=48.143067,11.563711&amp;sspn=0.034162,0.069793&amp;ie=UTF8&amp;view=map&amp;cid=2018548909850998850&amp;ved=0CCsQpQY&amp;hnear=&amp;ll=48.150626,11.547382&amp;spn=0.006872,0.013733&amp;z=16&amp;iwloc=A&amp;output=embed"></iframe><br /><small><a href="http://maps.google.de/maps?q=ecircle AG münchen&amp;hl=de&amp;cd=1&amp;ei=sK_WS-zJIcWe_gb7jYX2DQ&amp;sig2=rKa_ip2_xxB57JhCOQXP5g&amp;sll=48.143067,11.563711&amp;sspn=0.034162,0.069793&amp;ie=UTF8&amp;view=map&amp;cid=2018548909850998850&amp;ved=0CCsQpQY&amp;hnear=&amp;ll=48.150626,11.547382&amp;spn=0.006872,0.013733&amp;z=16&amp;iwloc=A&amp;source=embed" style="text-align:left">View Larger Map</a></small>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/04/27/3rd-munich-openhug-on-may-6/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
		<item>
		<title>Next Meeting 6. May 2010</title>
		<link>http://hbaseexplorer.wordpress.com/2010/03/15/next-meeting-6-may-2010/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/03/15/next-meeting-6-may-2010/#comments</comments>
		<pubDate>Mon, 15 Mar 2010 07:47:50 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Munich OpenHUG]]></category>

		<guid isPermaLink="false">http://hbaseexplorer.wordpress.com/?p=115</guid>
		<description><![CDATA[We have now set the date for the next meeting. It is four weeks before the Berlin Meeting, which runs in parallel to THE CONFERENCE, so it is a perfect time to gather the right questions to discuss there. I&#8217;ll post more about the talks later.]]></description>
			<content:encoded><![CDATA[<p>We have now set the date for the next meeting. It is four weeks before the Berlin Meeting, which runs in parallel to THE <a title="Berlin Buzzwords" href="http://berlinbuzzwords.de">CONFERENCE</a>, so it is a perfect time to gather the right questions to discuss there.</p>
<p>I&#8217;ll post more about the talks later.</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/03/15/next-meeting-6-may-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
		<item>
		<title>Hbase Patterns Talk @ Hadoop Meeting in Berlin</title>
		<link>http://hbaseexplorer.wordpress.com/2010/03/13/hbase-patterns-talk-hadoop-meeting-in-berlin/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/03/13/hbase-patterns-talk-hadoop-meeting-in-berlin/#comments</comments>
		<pubDate>Sat, 13 Mar 2010 16:55:32 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Hbase]]></category>

		<guid isPermaLink="false">http://hbaseexplorer.wordpress.com/?p=105</guid>
		<description><![CDATA[I had a chance to talk about the Patterns we regularly use at eCircle when building applications with HBase. Here is a summary from Isabel, and these are the slides.]]></description>
			<content:encoded><![CDATA[<p>I had a chance to talk about the Patterns we regularly use at <a href="http://ecircle.com">eCircle</a> when building applications with HBase. Here is a summary from <a title="HBase Patterns / Bob Schulze " href="http://blog.isabel-drost.de/index.php/archives/167/apache-hadoop-get-together-march-2010">Isabel</a>, and these are the <a title="Slides HBase Patterns" href="http://hbaseexplorer.files.wordpress.com/2010/03/bln2010hbasepatterns.pdf">slides</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/03/13/hbase-patterns-talk-hadoop-meeting-in-berlin/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
		<item>
		<title>2nd Munich OpenHUG</title>
		<link>http://hbaseexplorer.wordpress.com/2010/02/26/2nd-munich-openhug/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/02/26/2nd-munich-openhug/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 12:52:32 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Munich OpenHUG]]></category>

		<guid isPermaLink="false">http://althelies.wordpress.com/?p=81</guid>
		<description><![CDATA[Christoph Rupp started with an introduction to his embeddable HamsterDB, a nice and light alternative to BerkeleyDB. Besides a comprehensive feature set, he impressed with the strict quality rules that are applied to each release.  An embedded fast DB is certainly a different world than the massive scaling we do with Hadoop &#38; Co, but why [...]]]></description>
			<content:encoded><![CDATA[<p>Christoph Rupp started with an introduction to his embeddable <a title="HamsterDB" href="http://hamsterdb.com/">HamsterDB</a>, a nice and light alternative to <a title="BerkeleyDB" href="http://www.oracle.com/technology/products/berkeley-db/index.html">BerkeleyDB</a>. Besides a comprehensive feature set, he impressed with the strict quality rules that are applied to each release. An embedded fast DB is certainly a different world than the massive scaling we do with Hadoop &amp; Co, but why not imagine the HBase Shards using a really fast embedded piece of C code?</p>
<p>Afterwards we had some good discussions in different directions, ranging from Hadoop* to all kinds of cloud challenges.</p>
<p>This was our <a href="http://www.larsgeorge.com/2010/01/first-munich-openhug-meeting-summary.html">second</a> meeting, and I still assume there are more data-fighters out there in the Munich area. Spread the word!</p>
<p>Looking forward to the next Open Hadoop Users Group Meeting, probably mid-May 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/02/26/2nd-munich-openhug/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
		<item>
		<title>Where to put my persistence layer?</title>
		<link>http://hbaseexplorer.wordpress.com/2010/01/19/where-to-put-my-persistence-layer/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/01/19/where-to-put-my-persistence-layer/#comments</comments>
		<pubDate>Tue, 19 Jan 2010 20:56:54 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[Hbase]]></category>
		<category><![CDATA[DAO]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://althelies.wordpress.com/?p=43</guid>
		<description><![CDATA[After playing for a while with the put()s and get()s of your HBase client, you&#8217;ll probably start to think about how to organize the mess. When a classic DAO&#8230; This classic approach with one backend database ensures that the client code needs no knowledge of the database, nor of the actual persistence strategies [...]]]></description>
			<content:encoded><![CDATA[<p><em>After playing for a while with the put()s and get()s of your HBase client, you&#8217;ll probably start to think about how to organize the mess.</em></p>
<h2>When a classic DAO&#8230;</h2>
<p><a href="http://hbaseexplorer.files.wordpress.com/2010/01/dao11.gif"><img src="http://hbaseexplorer.files.wordpress.com/2010/01/dao11.gif?w=600" alt="Classic DAO" title="dao1"   class="aligncenter size-full wp-image-58" /></a></p>
<p>This classic approach with one backend database ensures that the client code needs no knowledge of the database, nor of the actual persistence strategies in the Persistence Layer (which was called <em>D</em>ata<em>A</em>ccess<em>O</em>bject some years ago). Modern software can even drive several database models just through different configurations. Proper interfaces allow independent testing of all layers.</p>
<p>The database itself ensures data consistency and has a strong API (SQL).</p>
<p>All that is nice, but now you want to manage <em>a lot</em> of data, and you may choose HBase to do this.</p>
<h2>&#8230;gets distributed</h2>
<p>When using a distributed, shared-nothing database, there is, by definition, no single point that manages the persistence strategies, such as</p>
<ul>
<li>Ensuring consistency (was the job of the RDBMS)</li>
<li>Transactions (RDBMS)</li>
<li>Maintaining indexes (RDBMS and your admin)</li>
<li><a href="http://en.wikipedia.org/wiki/Create,_read,_update_and_delete">CRUD</a> into/from several tables (classically a job of the Persistence Layer)</li>
<li>Caches and buffers to bridge short database hiccups</li>
<li>Handling security (code or database)</li>
</ul>
<p>As a first attempt, just let a new persistence layer for the distributed database do all this:</p>
<p><a href="http://hbaseexplorer.files.wordpress.com/2010/01/dao2.gif"><img src="http://hbaseexplorer.files.wordpress.com/2010/01/dao2.gif?w=600" alt="" title="Try to Access a DSN DB the classic way"   class="aligncenter size-full wp-image-60" /></a></p>
<p>(In a shared-nothing database, a <a href="http://en.wikipedia.org/wiki/Shard_(database_architecture)">shard</a> holds a fragment of your data row set. A master typically somehow knows of everything, but holds no data.)</p>
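<p>To make the first attempt concrete, here is a rough sketch of a persistence layer that is linked into every client and maintains an index table by itself. The class ClientSideDao and the table layout are assumptions for illustration, and plain dicts stand in for the distributed tables:</p>

```python
class ClientSideDao:
    """Hypothetical DAO compiled into each client; dicts stand in for tables."""

    def __init__(self, users, users_by_email):
        self.users = users                    # main table: row key -> columns
        self.users_by_email = users_by_email  # index table: email -> row key

    def put_user(self, row_key, email, name):
        old = self.users.get(row_key)
        if old is not None and old["email"] != email:
            # Remove the stale index entry before writing the new one.
            self.users_by_email.pop(old["email"], None)
        # Two separate writes: nothing makes them atomic across shards.
        self.users[row_key] = {"email": email, "name": name}
        self.users_by_email[email] = row_key

    def get_by_email(self, email):
        row_key = self.users_by_email.get(email)
        return None if row_key is None else self.users[row_key]
```

<p>The index only stays correct if every client that writes to the users table runs exactly this code.</p>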
<p>But how about consistency? We would have to update all clients at the same time to make sure that everybody properly maintains the index, adheres to the new security rules, etc. If we have more than one application, this gets even more complicated. Imagine doing it 24&#215;7.</p>
<p>So changes to the persistence layer require immediate distribution to all clients that use it. This is new and quite different from a classical RDBMS: &#8220;Let&#8217;s add an index&#8221; is not so simple anymore. There are several options for getting the persistence code distributed:</p>
<table border="0">
<tr>
<td>A</td>
<td>Shut down clients, redeploy, restart</td>
<td>Normally not possible; there is also the risk of forgetting a client</td>
</tr>
<tr>
<td>B</td>
<td>Build some schema version checking, let clients check the DAO version on every access, and reload the DAO code dynamically</td>
<td>While loading code dynamically is really cool, your QA department will probably not like it. You need good security measures as well&#8230;</td>
</tr>
<tr>
<td>C</td>
<td>Have an additional layer of servers that act as the DAO layer</td>
<td>This promises a solution to many problems, at the cost of those additional servers.</td>
</tr>
</table>
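<p>Option B could look roughly like the following sketch. It assumes that the wanted DAO version lives in some shared metadata (a dict here) and it replaces real dynamic code loading with a lookup in a hypothetical registry of DAO classes:</p>

```python
class PlainDao:
    """Version 1 of a hypothetical DAO: writes data only."""

    def put(self, tables, key, value):
        tables["data"][key] = value


class IndexingDao(PlainDao):
    """Version 2: additionally maintains an index table."""

    def put(self, tables, key, value):
        super().put(tables, key, value)
        tables["index"][value] = key


class VersionedClient:
    def __init__(self, meta, implementations):
        self.meta = meta                        # shared metadata, e.g. a small table
        self.implementations = implementations  # version -> DAO class (assumed registry)
        self.loaded_version = None
        self.dao = None

    def _ensure_current(self):
        # Checked on every access, as option B requires.
        wanted = self.meta["dao_version"]
        if wanted != self.loaded_version:
            self.dao = self.implementations[wanted]()
            self.loaded_version = wanted

    def put(self, tables, key, value):
        self._ensure_current()
        self.dao.put(tables, key, value)
```

<p>The extra metadata read on every access is the price of this option, in addition to the QA and security concerns in the table.</p>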
<h2>A separate set of DAO servers</h2>
<p>So if you have the money for some additional servers, this is the way to go. It offers solutions to all the problems mentioned above.</p>
<p><a href="http://hbaseexplorer.files.wordpress.com/2010/01/dao3.gif"><img src="http://hbaseexplorer.files.wordpress.com/2010/01/dao3.gif?w=600" alt="" title="Replace the RDBMS with a whole cluster"   class="aligncenter size-full wp-image-61" /></a></p>
<p>Performance might be a problem. The beauty of shared-nothing comes from the independent life that each thread in your business logic can live. If you query Google, you might (in that moment) well have 20 computers available just for your single request. This additional layer should scale at the same rate as the other I/O streams of your application, possibly in a one-to-one relation as shown in the picture above. If you have some reasonable work to do in your DAO layer (such as encrypting some fields, or calculating hashes for indexes), this computing power is not only an additional cost; it also frees your business layer from that work.</p>
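<p>As an illustration of such DAO-side work, the sketch below derives an index row key from a hash of a field value. The function names, the choice of SHA-256 and the truncation to 16 hex characters are assumptions, and a truncated hash can collide in principle:</p>

```python
import hashlib


def index_key(value, length=16):
    """Derive a fixed-length index row key from a field value."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return digest[:length]


def put_with_index(data, index, row_key, email):
    """Runs on the DAO node, so the business layer never computes hashes."""
    data[row_key] = {"email": email}
    index[index_key(email)] = row_key
```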
<p>So you have your separate DAO servers. How do you update them now? Restarting them all at the same time still means a short downtime. So here you are challenged to write code that allows you to do the mentioned things at runtime, such as changing permission settings or adding index rules. These servers can also hold database maintenance code.</p>
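<p>A minimal sketch of a DAO node whose rules can change while it is running might look like this. The class DaoNode and its methods are hypothetical, in-memory dicts stand in for the database, and stale index entries after overwrites are deliberately ignored to keep it short:</p>

```python
import threading


class DaoNode:
    """Hypothetical DAO server whose rules can be changed without a restart."""

    def __init__(self):
        self._lock = threading.Lock()
        self._writers = set()          # users allowed to write
        self._indexed_columns = set()  # columns that have an index rule
        self.data = {}                 # row key -> columns
        self.indexes = {}              # column -> {value -> row key}

    def grant_write(self, user):
        with self._lock:
            self._writers.add(user)

    def add_index(self, column):
        with self._lock:
            self._indexed_columns.add(column)
            idx = self.indexes.setdefault(column, {})
            # Backfill the new index from rows that already exist.
            for key, row in self.data.items():
                if column in row:
                    idx[row[column]] = key

    def put(self, user, key, row):
        with self._lock:
            if user not in self._writers:
                raise PermissionError(user + " may not write")
            self.data[key] = dict(row)
            for column in self._indexed_columns:
                if column in row:
                    self.indexes[column][row[column]] = key
```

<p>With several DAO nodes, a rule change like add_index would still have to reach all of them, but that is a handful of servers under your control rather than every client.</p>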
<p>You might want to use load balancing between the clients and the DAO nodes, which gives you the additional benefit of scaling and replacing nodes at runtime. The DAO nodes may well buffer calls or run them in a multithreaded fashion to give clients better response times from the database. A firewall can offer additional safety in your datacenter.</p>
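<p>The load-balancing idea can be sketched with a simple round-robin dispatcher. RoundRobinBalancer and FakeDaoNode are hypothetical names, and a real setup would more likely use a dedicated load balancer in front of the DAO nodes:</p>

```python
class FakeDaoNode:
    """Stand-in DAO node; the shared dict plays the role of the database."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def put(self, key, value):
        self.store[key] = (value, self.name)


class RoundRobinBalancer:
    """Hypothetical balancer that spreads calls over a pool of DAO nodes."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._calls = 0

    def replace(self, old, new):
        """Swap a node while clients keep running."""
        self._nodes[self._nodes.index(old)] = new

    def put(self, key, value):
        node = self._nodes[self._calls % len(self._nodes)]
        self._calls += 1
        node.put(key, value)
        return node  # returned only so the caller can see who served the call
```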
<p>With all this freedom, don&#8217;t forget that such a design still does not offer many things you may have gotten used to from traditional RDBMSs &#8211; unless you put them into your DAO layer code yourself (if you are a hard-core database expert). But you may not need many of these things &#8211; and if the scaling benefits outweigh the potential loss of precision and accuracy &#8211; the data storage will never limit your business again.</p>
<p>&#8211;Al</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/01/19/where-to-put-my-persistence-layer/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>

		<media:content url="http://hbaseexplorer.files.wordpress.com/2010/01/dao11.gif" medium="image">
			<media:title type="html">dao1</media:title>
		</media:content>

		<media:content url="http://hbaseexplorer.files.wordpress.com/2010/01/dao2.gif" medium="image">
			<media:title type="html">Try to Access a DSN DB the classic way</media:title>
		</media:content>

		<media:content url="http://hbaseexplorer.files.wordpress.com/2010/01/dao3.gif" medium="image">
			<media:title type="html">Replace the RDBMS with a whole cluster</media:title>
		</media:content>
	</item>
		<item>
		<title>hbasexplorer on sourceforge</title>
		<link>http://hbaseexplorer.wordpress.com/2010/01/15/hbasexplorer-now-public/</link>
		<comments>http://hbaseexplorer.wordpress.com/2010/01/15/hbasexplorer-now-public/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 22:27:07 +0000</pubDate>
		<dc:creator>Al</dc:creator>
				<category><![CDATA[HbaseExplorer]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://althelies.wordpress.com/?p=9</guid>
		<description><![CDATA[Don&#8217;t you know about HBase? It&#8217;s something that does away with the idea of traditional databases. Although it does not completely deny the good things that the RDBMS invented, it gets you to a new level of data storage, if you are willing to change. The idea is well (but in a somewhat complicated way) described in Google&#8217;s papers, [...]]]></description>
			<content:encoded><![CDATA[<p>Don&#8217;t you know about <a href="http://hadoop.apache.org/hbase">HBase</a>? It&#8217;s something that does away with the idea of traditional databases. Although it does not completely deny the good things that the RDBMS invented, it gets you to a new level of data storage, if you are willing to change. The idea is well (but in a somewhat complicated way) described in <a href="http://labs.google.com/papers/bigtable.html">Google&#8217;s papers</a>; great folks followed that idea and turned it into a public, really working piece of software.</p>
<p>Shared-nothing rules. Try it! Search something at <a href="http://google.com">Google</a>. Or <a href="http://www.bing.com">Bing</a>.</p>
<p>I want to make it a bit handier by starting a tool that can visualize the data in timestamp order, provide inter-linking between the data, and more. I think this is the truly revolutionary idea of BigTable: don&#8217;t look at the data only as a snapshot in time, but with the idea of history (and future) in mind.</p>
<p>Read more <a href="http://althelies.wordpress.com/hbaseexplorer/">here</a>.</p>
<p>&#8211;Al</p>
]]></content:encoded>
			<wfw:commentRss>http://hbaseexplorer.wordpress.com/2010/01/15/hbasexplorer-now-public/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b6637ada9326729c277528d2ea3711ac?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Al</media:title>
		</media:content>
	</item>
	</channel>
</rss>
