2012-01-18 Wed

22:13 Amazon DynamoDB - a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications (1661 Bytes) » myNoSQL
Amazon DynamoDB - a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications:

Werner Vogels:

Today is a very exciting day as we release Amazon DynamoDB, a fast, highly reliable and cost-effective NoSQL database service designed for internet scale applications. DynamoDB is the result of 15 years of learning in the areas of large scale non-relational databases and cloud services.

No words can describe (yet) the magnitude of this announcement.

Original title and link: Amazon DynamoDB - a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications (NoSQL database©myNoSQL)

22:07 The Little Known Secret of Redis (2262 Bytes) » myNoSQL

It’s the redis-benchmark. In Salvatore’s words:

a little known feature of redis-benchmark introduced recently by Pieter Noordhuis is that you can benchmark any command you want (only available in Redis unstable branch, but you can use redis-benchmark from unstable to benchmark Redis 2.4.x):

 $ ./redis-benchmark -q -n 100000 zadd sortedset 10 a 
 zadd sortedset 10 a: 152439.02 requests per second

To compare it with another command:

 $ ./redis-benchmark -q -n 100000 set foo bar 
 set foo bar: 153374.23 requests per second

However, you could say that we are setting the same element again and again. True… but there is a hidden feature of redis-benchmark that lets you randomize arguments:

 $ ./redis-benchmark -q -r 100000 -n 100000 zadd sortedset 10 ele:rand:000000000000
 zadd sortedset 10 ele:rand:000000000000: 104166.67 requests per second 
 $ redis-cli zcard sortedset
 (integer) 63202
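
The zcard result above is close to what plain random sampling would predict. As a rough check, assume that `-r 100000` replaces the `rand:000000000000` placeholder with a uniformly random number below 100000 on every request (an assumption about how redis-benchmark behaves, not something stated in the thread). Then 100000 requests over 100000 possible members should leave roughly 63% of them distinct. The helper names below are hypothetical, and no Redis server is involved.

```python
import random


def expected_distinct(key_space: int, draws: int) -> float:
    """Expected number of distinct values after `draws` uniform picks
    (with replacement) from `key_space` possible values."""
    return key_space * (1.0 - (1.0 - 1.0 / key_space) ** draws)


def simulated_distinct(key_space: int, draws: int, seed: int = 0) -> int:
    """Run the same experiment locally with a seeded generator."""
    rng = random.Random(seed)
    return len({rng.randrange(key_space) for _ in range(draws)})


if __name__ == "__main__":
    # Both numbers should land near the 63202 reported by zcard.
    print(round(expected_distinct(100_000, 100_000)))
    print(simulated_distinct(100_000, 100_000))
```

Under that assumption the expected count is about 63212, so a sorted set of 63202 members after 100000 randomized ZADDs is what one would expect rather than a sign of lost writes.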

The same thread also covers the performance of sorted sets in Redis.

Original title and link: The Little Known Secret of Redis (NoSQL database©myNoSQL)

19:34 MongoDB at Viber Media: The Platform Enabling Free Phone Calls and Text Messaging for Over 18 Million Active Users (5586 Bytes) » myNoSQL

Back in November there was quite a bit of buzz around MongoDB being behind Viber Media’s technology for free phone calls and text messaging. Understandably so, considering we are talking about a platform with more than 18 million active users talking for more than 11 million minutes every day, and these numbers have probably grown quite a bit over the holiday season.

The nice folks from Viber Media[1] have been kind enough to share more details about their platform and the way MongoDB is used. Here is the complete exchange:

Q: Could you briefly describe how your application works so we could better understand where MongoDB fit into your architecture?

Viber’s mobile clients connect to a central service that can route messages to other such clients. These messages can either be text messages or “signals” for establishing a phone call. These front-end servers use MongoDB as a common data-store. We store variable length documents that include dictionaries.

Q: What were the main reasons that led you to use MongoDB? Were there other solutions that you’ve been tempted to use for your architecture?

We started with a proprietary code, but with the large increase in the number of new registrations per day, we realized that we needed a database that will be both scalable and redundant. At that time, this was the only database that looked like a good fit for both.

Q: The announcement mentioned that currently your clusters run on 130 nodes in the Amazon cloud. Could you describe the deployment and what components of the Amazon cloud are involved?

We have 65 MongoDB shards. Each shard consists of a master and a slave. A single EC2 instance is used for running arbiters for all shards. We are using a RAID5 (moving to RAID10) volume consisting of 6 EBS volumes for each MongoDB machine. All instances are m2x.large but we plan to migrate into larger instances.

More Amazon technology at work:

  • we are using ELB as a front-end for our proprietary load-balancers and for off-loading HTTPS processing
  • we are using S3 for storing pictures sent between users.

Q: How do you monitor your MongoDB cluster? Are there people in your team dedicated to managing the MongoDB cluster?

We have a small team to support our application and MongoDB cluster (we’re looking for MongoDB admins, BTW). We use our own monitoring server to monitor both, and 10gen MMS (MongoDB Monitoring Service) to monitor MongoDB alone.

Q: Your platform has seen amazing growth reaching 18 mil. active users in less than 1 year. What has this growth meant in terms of evolving and managing the MongoDB deployment?

Hard work :). MongoDB has been very useful for increasing our reach to active users. Our exact methods are proprietary and therefore cannot be disclosed.

Q: What were the most notable moments in the evolution of your MongoDB cluster? Has it seen any radical changes over the time? Did you have to migrate your cluster to newer versions of MongoDB, etc.?

We have migrated versions from 1.7.6 to 1.8 and now to 2.0. We are still having a few problems with the last version, but we keep improving all the time.

Q: Were there any (major) bumps in the road with MongoDB? Or differently put, are there areas in which you’d like to see MongoDB improving?

  1. The database of the config server is not recovering (no master-slave). This misunderstanding has caused us to have 24 hours’ downtime with Viber at the beginning.
  2. The memory consumption of MongoDB is too high.

Thanks guys and good luck growing your platform!


  1. My thanks also to Meghan Gill and Darah Roslyn, who helped arrange this interview.

Original title and link: MongoDB at Viber Media: The Platform Enabling Free Phone Calls and Text Messaging for Over 18 Million Active Users (NoSQL database©myNoSQL)

15:42 DNS (5158 Bytes) » 玉面飞龙的BLOG

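In terms of the CAP theorem, DNS sits on the AP side (availability and partition tolerance).

From a database point of view, DNS can be described as a distributed, hierarchical, key-value, highly available system.

Distributed and hierarchical
The hierarchical and distributed structure of DNS

The address www.yumianfeilong.com is really www.yumianfeilong.com. (the trailing "." denotes the root). "." is the root, ".com" is the TLD DNS, and "yumianfeilong.com" is the second-level DNS. "www" can be thought of as a server within that second-level domain.

Each level of DNS stores the addresses of the DNS servers one level down. To find the IP address for a name, a lookup can therefore start at the root DNS to get the next-level DNS, then query that server, and so on down the hierarchy.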
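
The hierarchy described above can be sketched by splitting a name into the zones that are visited from the root downward. This is only an illustration of the naming structure; the function name `zone_chain` is made up for this example, and no real DNS lookup is performed.

```python
def zone_chain(fqdn: str) -> list[str]:
    """Return the zones visited from the root down to the full name."""
    # Drop the trailing root dot (if any) and split into labels.
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    chain = ["."]  # every lookup conceptually starts at the root
    # Walk from the rightmost label (the TLD) toward the leftmost (the host).
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


if __name__ == "__main__":
    for zone in zone_chain("www.yumianfeilong.com"):
        print(zone)
```

Each element in the returned list is a point where one level hands the query on to the next.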

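Distributed queries
DNS query

The two most common kinds of DNS query are recursive queries and iterative (or nonrecursive) queries. The figure above shows the recursive mode, in which the DNS resolver does all of the resolution work on behalf of the client. The steps are as follows: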

1. A user types the URL http://www.example.com into a browser.
2. The browser sends a request for the IP address of www.example.com to its local
resolver (stub-resolver).
3. The stub-resolver queries the locally configured DNS Resolver for the IP
address of www.example.com.
4. The DNS Resolver looks up www.example.com in local tables (its cache), but it
isn’t found.
5. The DNS Resolver sends a query to a root-server for the IP (the A RR) of
www.example.com.
6. The root-server only supports iterative (nonrecursive) queries (see the
upcoming section “Iterative (Nonrecursive) Queries”) and answers with a list of
name servers that are authoritative for the next level in the domain name
hierarchy, which in this case is the gTLD .com (this is called a referral).
7. The DNS Resolver selects one of the authoritative gTLD servers received in the
previous referral and sends it a query for the IP of www.example.com.
8. The gTLD name server only supports iterative queries and answers with the
authoritative name servers for the Second-Level Domain (SLD) example.com (a
referral).
9. The DNS Resolver selects one of the authoritative DNS servers for example.com
from the previous referral and sends it a query for the IP (the A RR) of
www.example.com.
10. The zone file for example.com defines www.example.com as a CNAME record (an
alias) for joe.example.com. The authoritative name server answers with the
www.example.com CNAME RR and, in this case, the A RR for joe.example.com,
which we will assume is 192.168.254.2.
11. The DNS Resolver sends the response joe.example.com=192.168.254.2
(together with the CNAME RR www=joe) to the original client stub-resolver.
12. The stub-resolver sends www.example.com=192.168.254.2 to the user’s browser.
13. The browser sends a request to 192.168.254.2 for the web page.
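
The steps above can be sketched as a toy simulation. Everything here is in memory: the server names (`root`, `gtld`, `ns.example`), the record tables, and the functions `ask` and `resolve` are assumptions made for illustration, not part of any DNS library. Only the www.example.com, joe.example.com, and 192.168.254.2 values come from the walkthrough.

```python
# Toy name-server data; every server name below is illustrative.
SERVERS = {
    "root": {"referrals": {"com.": "gtld"}},
    "gtld": {"referrals": {"example.com.": "ns.example"}},
    "ns.example": {
        "cname": {"www.example.com.": "joe.example.com."},
        "a": {"joe.example.com.": "192.168.254.2"},
    },
}


def ask(server: str, name: str) -> dict:
    """Answer one nonrecursive query: an address, an alias, or a referral."""
    data = SERVERS[server]
    # Follow a CNAME held by this server, if there is one (step 10).
    target = data.get("cname", {}).get(name, name)
    if target in data.get("a", {}):
        answer = {"address": data["a"][target]}
        if target != name:
            answer["cname"] = target
        return answer
    # Otherwise refer the caller to a server closer to the name (steps 6 and 8).
    for zone, next_server in data.get("referrals", {}).items():
        if name.endswith(zone):
            return {"referral": next_server}
    return {}


def resolve(name: str, cache: dict) -> str:
    """Play the DNS Resolver: check the cache, then follow referrals from the root."""
    if name in cache:  # step 4
        return cache[name]
    server = "root"  # step 5
    while True:
        answer = ask(server, name)
        if "address" in answer:  # step 11
            cache[name] = answer["address"]
            return answer["address"]
        if "referral" not in answer:
            raise LookupError(f"cannot resolve {name}")
        server = answer["referral"]  # steps 7 and 9


if __name__ == "__main__":
    print(resolve("www.example.com.", {}))
```

The client only ever calls `resolve`, which is what makes the query recursive from its point of view, while each call to `ask` is an iterative query that returns either an answer or a referral.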

KEY-VALUE

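Clearly, you can look up a domain name and get back an IP address, and you can also look up an IP address and get back a domain name (reverse mapping). nslookup and dig are the commonly used commands.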

nslookup yumianfeilong.com.

Non-authoritative answer:
Name:   yumianfeilong.com
Address: 69.163.181.118

nslookup 69.163.181.118

Non-authoritative answer:
118.181.163.69.in-addr.arpa     name = apache2-bongo.yerevan.dreamhost.com.

Authoritative answers can be found from:
181.163.69.in-addr.arpa nameserver = ns1.dreamhost.com.
181.163.69.in-addr.arpa nameserver = ns3.dreamhost.com.
181.163.69.in-addr.arpa nameserver = ns2.dreamhost.com.
ns1.dreamhost.com       internet address = 66.33.206.206
ns2.dreamhost.com       internet address = 208.96.10.221
ns3.dreamhost.com       internet address = 66.33.216.216


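DNS reverse mapping is implemented through the reserved domain "IN-ADDR.ARPA." (for IPv4). It works much like looking up the IP address of a domain name, and likewise comes in recursive and nonrecursive forms.

The following shows a recursive query for the domain name that corresponds to the IP address 192.168.250.15.

The design idea behind reverse mapping may be worth borrowing in the design of some applications.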
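
As a small illustration of reverse mapping, the name that gets queried is the IPv4 address with its octets reversed, placed under in-addr.arpa, which matches the 118.181.163.69.in-addr.arpa name in the nslookup output above. The function name `reverse_name` is hypothetical. Python's standard `ipaddress` module exposes the same thing as `reverse_pointer`.

```python
import ipaddress


def reverse_name(ipv4: str) -> str:
    """Build the in-addr.arpa name used for a reverse (PTR) lookup."""
    # IPv4Address validates the input and normalizes it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa."


if __name__ == "__main__":
    print(reverse_name("69.163.181.118"))
    print(reverse_name("192.168.250.15"))
    # The standard library gives the same name, without the trailing dot.
    print(ipaddress.ip_address("69.163.181.118").reverse_pointer)
```

Reversing the octets puts the most significant part of the address on the right, so reverse lookups can be delegated level by level in the same way as ordinary domain names.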

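High availability

This is, naturally, replication in master-slave mode, which is also part of the reason real-time consistency is hard to achieve. After a DNS change is made on the master, it takes some time for the change to be pushed to the slaves and to reach other DNS servers that still hold the outdated record in their caches.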
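
The delay caused by caching can be sketched with a minimal time-to-live cache. This is a simplified model under stated assumptions: the class `TtlCache`, the function `lookup`, the 300-second TTL, and the 192.0.2.x addresses are all made up for illustration, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TtlCache:
    """Minimal DNS-style cache: entries are served until their TTL runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry time in seconds)

    def put(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            # Missing or expired: forget it so the caller asks the master again.
            self._entries.pop(name, None)
            return None
        return entry[0]


def lookup(name, master, cache, now, ttl=300):
    """Serve from the cache when possible, otherwise fetch from the master."""
    address = cache.get(name, now)
    if address is None:
        address = master[name]
        cache.put(name, address, ttl, now)
    return address


if __name__ == "__main__":
    master = {"www.example.com.": "192.0.2.1"}
    cache = TtlCache()
    print(lookup("www.example.com.", master, cache, now=0))    # fetched and cached
    master["www.example.com."] = "192.0.2.2"                   # change on the master
    print(lookup("www.example.com.", master, cache, now=120))  # still the old record
    print(lookup("www.example.com.", master, cache, now=300))  # TTL expired, new record
```

Between the change on the master and the expiry of the cached entry, clients behind this cache keep seeing the old address, which is the availability-over-consistency trade-off described above.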

Reference: Pro DNS and BIND

15:38 Implementing Auto Saves Using RavenDB: NoSQL Tutorials (1940 Bytes) » myNoSQL
Implementing Auto Saves Using RavenDB: NoSQL Tutorials:

[…] implementing Auto Save in the RDBMS system could be a problem because of multiple reasons:

  • The schema and overall logic changes to save versioned data in the RDBMS system will be non-trivial
  • There might be validation checks that fail because users hadn’t filled out some fields at that point.
  • Making periodic (30 second) transactional updates to any live system is not good for overall performance.

A work around would be saving your Object Model to RavenDB directly and if user visits the document after a time out, load both Transactional Data and Object data, compare the timestamp and use the freshest set of data.
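
The "use the freshest set of data" step can be sketched as follows. This is a minimal Python sketch, not RavenDB or RDBMS client code: the `Snapshot` type, its fields, and the `freshest` function are hypothetical, and loading the two copies from their stores is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Snapshot:
    source: str          # "transactional" or "autosave" (labels are illustrative)
    saved_at: datetime   # when this copy was last written
    fields: dict         # the form data itself


def freshest(transactional: Optional[Snapshot],
             autosave: Optional[Snapshot]) -> Optional[Snapshot]:
    """Pick whichever copy was saved most recently; either one may be missing."""
    candidates = [s for s in (transactional, autosave) if s is not None]
    if not candidates:
        return None
    # On equal timestamps max() keeps the first candidate, i.e. the transactional copy.
    return max(candidates, key=lambda s: s.saved_at)


if __name__ == "__main__":
    committed = Snapshot("transactional", datetime(2012, 1, 18, 15, 0), {"title": "Draft"})
    draft = Snapshot("autosave", datetime(2012, 1, 18, 15, 5), {"title": "Draft v2"})
    print(freshest(committed, draft).source)
```

Preferring the transactional copy on a tie is a design choice made for this sketch; the autosaved copy only wins when it is strictly newer.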

By far the best document database use case I have read about in quite a while.

Original title and link: Implementing Auto Saves Using RavenDB: NoSQL Tutorials (NoSQL database©myNoSQL)

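09:47 ACOUG February 2012: Ask Tom and Eygle - Shanghai (2225 Bytes) » Oracle Life

Author: eygle, published on eygle.com

ACOUG's first event of 2012 is coming to Shanghai. The headline guest is Thomas Kyte, the man behind the ASKTOM site and a vice president at Oracle.

I will also be giving a talk at this event. If you are interested, please register soon:
http://www.acoug.org/events/239.html

For details, follow ACOUG on Weibo:
http://weibo.com/acoug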



07:38 Critical Oracle Database Bug - System Change Number (SCN) (CVE-2012-0082) (2251 Bytes) » Oracle Security Blog
InfoWorld magazine today published detailed information regarding Oracle Database security bug CVE-2012-0082, which has associated fixes in Oracle's January 2012 Critical Patch Update.  This security vulnerability specifically relates to the Oracle System Change Number (SCN) and ways to increase the SCN beyond the current maximum value (SCN Headroom or Maximum Reasonable SCN) in order to stop processing of database transactions.

Where this vulnerability gets interesting is that the SCN is synchronized to the highest SCN when two databases are connected via a database link.  Therefore, it is possible to increase a database to the near maximum SCN through a database link, which will cascade through to all other interconnected databases.  The result can be ORA-600 errors and potentially database crashes on the database with the lower SCN.
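
The cascade described above can be sketched with a toy model. This is not Oracle code and ignores how real SCNs advance over time; it only models the stated rule that two databases connected by a database link end up at the higher of their two SCNs. The database names, SCN values, and the functions `connect` and `propagate` are made up for illustration.

```python
def connect(scns: dict, a: str, b: str) -> None:
    """Model a database-link connection: both sides end up at the higher SCN."""
    highest = max(scns[a], scns[b])
    scns[a] = scns[b] = highest


def propagate(scns: dict, links: list) -> dict:
    """Repeat link connections until no SCN changes any more."""
    result = dict(scns)  # leave the caller's mapping untouched
    changed = True
    while changed:
        changed = False
        for a, b in links:
            before = (result[a], result[b])
            connect(result, a, b)
            changed = changed or before != (result[a], result[b])
    return result


if __name__ == "__main__":
    scns = {"erp": 1_000, "crm": 1_200, "dw": 900, "inflated": 10**12, "isolated": 500}
    links = [("erp", "crm"), ("crm", "dw"), ("dw", "inflated")]
    for name, scn in propagate(scns, links).items():
        print(name, scn)
```

In this model a single database with an inflated SCN raises every database reachable from it through links, while a database with no links keeps its own value.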

This vulnerability appears to have been discovered as the result of a bug in RMAN which can cause the SCN to reach current maximum SCN value and a change in the way the Maximum Reasonable SCN is calculated in 11.2.0.2.  The 11.2.0.2 change appears to have impacted or crashed at least a hundred databases at a very large Oracle customer.

As this vulnerability will get significant press, we foresee an "arms race" ensuing with the release of different methods to maliciously increment the current SCN and techniques to perform database denial-of-service attacks related to the SCN.

Integrigy will be publishing in the near future our analysis of the impact of this vulnerability along with recommendations on mitigating the risk in your organization.

Oracle has published more information regarding SCNs and potential impact in a My Oracle Support (MOS) note (requires My Oracle Support access) -

Information on the System Change Number (SCN) and how it is used in the Oracle Database [ID 1376995.1]

06:55 Setting Up, Modeling and Loading Data in HBase With Hadoop and Clojure: NoSQL Tutorials (1750 Bytes) » myNoSQL
Setting Up, Modeling and Loading Data in HBase With Hadoop and Clojure: NoSQL Tutorials:

Even if you are not familiar with Clojure, you’ll still enjoy this fantastic HBase tutorial:

And that’s the thing: if you are loading literally gajigabytes of data into HBase you need to be pretty sure that it’s going to be able to answer your questions in a reasonable amount of time. Simply cramming it in there probably won’t work (indeed, that approach probably won’t work great for anything). I loaded and re-loaded a test set of twenty thousand rows until I had something that worked.

Original title and link: Setting Up, Modeling and Loading Data in HBase With Hadoop and Clojure: NoSQL Tutorials (NoSQL database©myNoSQL)

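05:30 More Oracle DBA Positions: Vancouver Burnaby (801 Bytes) » 木匠 Creative and Flexible

The job is in Burnaby, Vancouver, in a good location right next to SFU. The company culture is good too: work-life balance, with a focus on achievements and results. Annual salary is 100k+.
The department of a former colleague and old friend of mine is hiring; he heads the department.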

http://www.vivonet.com/about-us/careers/sr-database-administrator

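If you are interested, please contact me. Finding a job in Canada and the US also depends mainly on personal networks and referrals from friends.

Because the job and the surroundings of the office are so good, with a walk on the SFU campus possible at any time, I am a little tempted myself.
It seems that the stability and security of Oracle DBA jobs are still quite high, with a steady stream of local opportunities.  ^_^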

02:43 Hadoop Versions Take 2: What You Wanted to Know About Hadoop, but Were Too Afraid to Ask: Genealogy of Elephants (3279 Bytes) » myNoSQL
Hadoop Versions Take 2: What You Wanted to Know About Hadoop, but Were Too Afraid to Ask: Genealogy of Elephants:

Another great diagram explaining the complicated tree of Hadoop versions.

Apache Hadoop Versions

Credit: Konstantin I. Boudnik & Cos

When compared with the other diagram of Apache Hadoop versions, this one contains some very interesting details about the versions of Hadoop used by third party distributions like EMC, IBM, MapR, and even Azure:

The diagram above clearly shows a few important gaps of the rest of commercial offerings:

  • none of them supports Kerberos security (EMC, IBM, and MapR)
  • unavailability of Hbase due to the lack of HDFS append in their systems (EMC, IBM). In case of MapR you end up using a custom HBase distributed by MapR. I don’t want to make any speculation of the latter in this article.

If I were in a position to choose which version of Hadoop to use for a project, here is where I’d start:

  1. if the project had a budget for prototyping and experimentation, my first choice would be the latest official Apache distribution. This would give access to the latest and greatest (though not always bug-free) code, and more importantly it would let the team draw on the Hadoop community’s know-how
  2. if the project needed to get up to speed as fast as possible (and I could get some budget for training), I’d start my investigation with the Cloudera Distribution of Hadoop. Even with no budget for Cloudera support, the advantage would be in having everything well packaged together.

Original title and link: Hadoop Versions Take 2: What You Wanted to Know About Hadoop, but Were Too Afraid to Ask: Genealogy of Elephants (NoSQL database©myNoSQL)

2012-01-17 Tue