2012-01-20 Fri

17:06 Reinforcing Couchbase's Commitment to Open Source (1351 Bytes) » myNoSQL
Reinforcing Couchbase's Commitment to Open Source:

Bob Wiederhold, the CEO of Couchbase:

We’re 100% committed to open source and all of our code is available under the Apache 2.0 license.

Is.

Original title and link: Reinforcing Couchbase’s Commitment to Open Source (NoSQL database©myNoSQL)

16:39 The State of NoSQL in 2012 (1707 Bytes) » myNoSQL
The State of NoSQL in 2012:

Wise words from Sid Anand:

Many of the NoSQL vendors view the “battle of NoSQL” to be akin to the RDBMS battle of the 80s, a winner-take-all battle. In the NoSQL world, it is by no means a winner-take-all battle. Distributed Systems are about compromises.

While there might be some who would like to see a NoSQL battle, and at some point money will talk, I hope the real battle will remain centered on the technical aspects and on which data solutions solve each specific problem better. The sort of battle in which everyone learns something.

Original title and link: The State of NoSQL in 2012 (NoSQL database©myNoSQL)

14:00 Log Buffer #255, A Carnival of the Vanities for DBAs (396 Bytes) » The Pythian Blog
With winter and its cold weather starting to set in across most of the world, now is the time when travelers start to think about warming things up. For most, that means flying to hot and sunny destinations. Another way of looking at it, however, is to head out for a different kind of sizzle. [...]
01:01 Auto Scaling in the Amazon Cloud: Netflix's Approach and Lessons Learned (1700 Bytes) » myNoSQL
Auto Scaling in the Amazon Cloud: Netflix's Approach and Lessons Learned:

Another great post for today from the engineering team at Netflix:

Auto scaling is a very powerful tool, but it can also be a double-edged sword. Without the proper configuration and testing it can do more harm than good. A number of edge cases may occur when attempting to optimize or make the configuration more complex. As seen above, when configured carefully and correctly, auto scaling can increase availability while simultaneously decreasing overall costs.

Original title and link: Auto Scaling in the Amazon Cloud: Netflix’s Approach and Lessons Learned (NoSQL database©myNoSQL)

00:49 CouchDB: A Season Finale (6258 Bytes) » myNoSQL

There was a story earlier this year that I, as someone who has spent an enormous amount of time contributing to open source projects, thought was no story. Considering how much was published about it, chances are you have already read something about Damien Katz’s The future of CouchDB.

At the time of that post, my draft looked like this:

And now I, and the Couchbase team, are mostly moving on. It’s not that we think CouchDB isn’t awesome. It’s that we are creating the successor to it: Couchbase Server. A product and project with similar capabilities and goals, but more faster, more scalable, more customer and developer focused. And definitely not part of Apache.

Elvis has left the building. Please welcome The Beatles!

I always thought that some sort of message from its creator was needed to completely clear the waters around CouchDB. Damien’s post, together with the earlier post from Couchbase announcing the discontinuation of the Couchbase Single Server (Couchbase’s CouchDB distribution), brought closure to the CouchDB saga. And that was good.

I knew that the Apache CouchDB project and community were doing fine. Noah Slater’s email just confirmed that:

As some of you may have already read, Damien Katz, Apache CouchDB’s original developer, has publicly announced that he intends to focus his time exclusively on developing other products for his company. Damien has had very little involvement in the CouchDB project for a year or more now, so, for many people, this is confirmation of what they already knew. […]

Our biggest strength has always been the breadth and depth of our community of developers and users. In the very near future, we’ll be voting in a new committer, appointing a new PMC member, sprucing up the website, and making a major new release

Late last year, I also suggested that Cloudant would become the go-to company for CouchDB. Adam Kocoloski’s post confirmed this too:

We, along with a host of other companies, strongly support the open source community in building CouchDB and we do not plan on stopping. We have been fortunate in our ability to attract outstanding engineers, investors, and customers. We intend to continue devoting resources to Apache CouchDB and offer our help in any way the community desires.

While I could understand some of the criticisms[1], my conclusion was pretty close to what Bradley Holt wrote:

Going forward, you’ll have two choices, either Apache CouchDB or Couchbase Server. The road map for Apache CouchDB will continue to be determined by community consensus. The road map for Couchbase Server will be determined by Couchbase, the company.

But I was left with a nagging feeling that I missed something. I kept on circling around a small part of the original post:

What’s the future of CouchDB? It’s Couchbase.

How could a product that is removing defining features (e.g. the HTTP RESTful API or the peer-to-peer replication), that is already different (Volker Mische’s post provides details), and that offers no clear migration path be the future of CouchDB?

The answer is actually simpler than I thought:

Couchbase is the future of CouchDB as CouchDB was the future of Lotus Notes. A new product that takes inspiration from the experience and lessons learned while building the previous one.

And that was a CouchDB season finale. I’m already looking forward to the next season’s plots.

Original title and link: CouchDB: A Season Finale (NoSQL database©myNoSQL)

2012-01-19 Thu

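23:50 "Oracle DBA Notes" and the 51CTO Book Author of the Year (1766 Bytes) » Oracle Life

Author: eygle, published on eygle.com

I have heard from the publisher that, in 51CTO's annual selection, I have once again been voted the year's author most loved by readers.

Thanks to the 51CTO website for keeping up its book selection over the years. The selection carries real influence and gives real support to original books and to book writing.

Thanks to the publisher, and thanks to the readers for their support and affection. Thank you, and happy Year of the Dragon to everyone!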

19:39 RainStor Big Data Analytics on Hadoop Promises Impressive Data Compression Rates (4241 Bytes) » myNoSQL

RainStor has announced Big Data Analytics on Hadoop:

  • The highest data compression in the industry with up to 40x reduction, compared to raw data typically stored in HDFS, with no re-inflation required for access
  • The ability to run faster query and analysis using both SQL query and MapReduce with 10-100x faster results
  • The ability to perform analytics directly in Hadoop, reducing the need to create copies and transfer data out
  • Reduced nodes in a Hadoop cluster with ~85 percent lower operating costs.

A couple of comments:

  • RainStor is not the only solution that can perform analytics directly in Hadoop
  • Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL
  • RainStor MapReduce support is via Pig
  • according to this, there’s an interesting aspect of RainStor support of SQL and MapReduce:

    Users can choose SQL for rapid response ad-hoc queries or run batch jobs using MapReduce against RainStor data.  Additionally you can interoperate SQL and MapReduce and join results from a query against RainStor and against native CSV files on HDFS.

    As a side note, Toad for Cloud from Quest is a tool that tries to provide a table-based perspective of data in relational and NoSQL databases

Anyway, the most interesting part of the announcement is RainStor’s claimed data compression level (up to 40x) and the fact that accessing data doesn’t require re-inflation. According to an infographic, the currently available compression solutions top out at 8x:

  • Hadoop LZO: 3x
  • Compressed relational: 6x
  • Flatfile Gzip: 7x
  • Columnar: 8x

If such compression levels can be achieved frequently and the impact on other server resources (CPU, memory) is minimal, RainStor Big Data Analytics on Hadoop will definitely be an interesting part of the Hadoop market.
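To put the ratios above in perspective, here is a minimal arithmetic sketch. The ratios are the ones quoted in this post (40x being RainStor's claimed upper bound, not a measured result); the 100 TB raw volume is a hypothetical figure chosen only for illustration, and HDFS replication is ignored.

```python
# Compression ratios quoted in the post; 40x is RainStor's claimed upper bound.
RATIOS = {
    "Hadoop LZO": 3,
    "Compressed relational": 6,
    "Flatfile Gzip": 7,
    "Columnar": 8,
    "RainStor (claimed, up to)": 40,
}

RAW_TB = 100.0  # hypothetical raw data volume, for illustration only


def stored_size(raw_tb: float, ratio: float) -> float:
    """Size on disk after compressing raw_tb at the given ratio."""
    return raw_tb / ratio


def space_saved_pct(ratio: float) -> float:
    """Percentage of the raw footprint saved at the given ratio."""
    return (1 - 1 / ratio) * 100


for name, ratio in RATIOS.items():
    size = stored_size(RAW_TB, ratio)
    saved = space_saved_pct(ratio)
    print(f"{name:28s} {size:6.1f} TB  ({saved:.1f}% saved)")
```

Note that the step from 8x to 40x takes the footprint from 12.5 TB to 2.5 TB on this hypothetical volume, which is a five-fold difference in stored data even though the percentage saved only moves from 87.5% to 97.5%.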

Before leaving you with the infographic, here is a nice quote from RainStor CEO, John Bantleman:

We see Hadoop as a platform like Linux, which needs solutions on top to deliver value.

Hadoop Data Compression

Original title and link: RainStor Big Data Analytics on Hadoop Promises Impressive Data Compression Rates (NoSQL database©myNoSQL)

17:00 Using MongoDB Replica Sets With Node.js on Microsoft Azure: NoSQL Tutorials (1993 Bytes) » myNoSQL
Using MongoDB Replica Sets With Node.js on Microsoft Azure: NoSQL Tutorials:

Mariano Vazquez explains how to configure MongoDB replica sets on Microsoft Azure and how that works:

  • MongoDB will run the native binaries on a worker role and will store the data in Windows Azure storage using Windows Azure Drive (basically a hard disk mounted on Azure Page blobs)
  • The good thing about using Azure Storage is that the data is georeplicated. It will also make backup easier because of the snapshot feature of blob storage (which is not a copy but a diff).
  • It will use the local hard disk in the VM (local resources in the Azure jargon) to store the log files and a local cache.
  • You can scale out to multiple Mongo Replica Sets by increasing the instance count of the MongoDB role
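Once a replica set like the one described above is running, a client normally connects by listing the members in a standard `mongodb://` connection URI with a `replicaSet` option. The sketch below only builds such a URI. The host names, port, and replica set name are hypothetical and not taken from the tutorial, and the helper `replica_set_uri` is an illustrative name, not part of any MongoDB driver.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, database=""):
    """Build a standard mongodb:// URI that lists every replica set member."""
    if not hosts:
        raise ValueError("at least one host is required")
    members = ",".join(f"{host}:{port}" for host, port in hosts)
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{members}/{database}?{query}"


# Hypothetical member addresses; real ones depend on the deployment.
uri = replica_set_uri(
    [
        ("mongo-0.example.internal", 27017),
        ("mongo-1.example.internal", 27017),
        ("mongo-2.example.internal", 27017),
    ],
    replica_set="rs0",
)
print(uri)
```

Listing more than one member lets the driver discover the current primary even if one of the listed hosts is down.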

Original title and link: Using MongoDB Replica Sets With Node.js on Microsoft Azure: NoSQL Tutorials (NoSQL database©myNoSQL)

16:53 Pros and Cons of Using MapReduce With Distributed Key-Value Stores: HBase, Cassandra, Riak (2038 Bytes) » myNoSQL

Old Quora question with very good answers.

  • (pro) can (potentially) query live data
  • (pro) can (conceptually) be highly efficient at joining data sets that are identically sharded on the join key (the joins can be pushed down into the key-value store itself)
  • (con) full scans (the most common pattern for map-reduce) are most likely to be much faster with raw file system access
  • (con) because of the better decoupling of computation and storage in the GFS+Map-Reduce model - tolerating hot spots (resulting from MR jobs) is much easier
  • (con) key-value stores are rarely arranged to have schemas optimized for analytics

Naoki Yanai
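The second "pro" above, joining data sets that are identically sharded on the join key, can be sketched in a few lines. This is a toy in-memory illustration and not how HBase, Cassandra, or Riak implement anything: the shard count, the CRC32-based placement, and the sample data are all assumptions. The point it shows is that when both data sets use the same shard function, each shard can be joined locally and no rows need to move between shards.

```python
from collections import defaultdict
from zlib import crc32

NUM_SHARDS = 4  # hypothetical shard count


def shard_of(key: str) -> int:
    """Deterministic shard assignment shared by both data sets."""
    return crc32(key.encode("utf-8")) % NUM_SHARDS


def partition(rows):
    """Group (key, value) rows by shard, as a sharded store would."""
    shards = defaultdict(list)
    for key, value in rows:
        shards[shard_of(key)].append((key, value))
    return shards


def local_join(left_rows, right_rows):
    """Hash join over the rows of a single shard."""
    index = defaultdict(list)
    for key, value in left_rows:
        index[key].append(value)
    return [(key, lv, rv) for key, rv in right_rows for lv in index.get(key, [])]


def co_partitioned_join(left, right):
    """Join shard by shard; no row ever leaves its shard."""
    left_shards, right_shards = partition(left), partition(right)
    joined = []
    for shard_id in range(NUM_SHARDS):
        joined.extend(
            local_join(left_shards.get(shard_id, []), right_shards.get(shard_id, []))
        )
    return joined


# Made-up sample data keyed by user id.
users = [("u1", "alice"), ("u2", "bob"), ("u3", "carol")]
orders = [("u1", "book"), ("u3", "lamp"), ("u1", "pen"), ("u4", "desk")]
result = sorted(co_partitioned_join(users, orders))
print(result)
```

If the two data sets were sharded on different keys, matching rows could sit on different shards and a shuffle step would be needed first, which is the work the co-partitioned layout avoids.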

Original title and link: Pros and Cons of Using MapReduce With Distributed Key-Value Stores: HBase, Cassandra, Riak (NoSQL database©myNoSQL)

16:51 Quiz Night (1 Bytes) » Oracle Scratchpad
A
00:03 Notes About Amazon DynamoDB » myNoSQL

2012-01-18 Wed