
2010-06-09 Wed

16:19 Presentation: Hive - A Petabyte Scale Data Warehouse Using Hadoop (1765 Bytes) » myNoSQL

Lately I’ve been mentioning Hive quite a few times when writing about working with NoSQL data, but I was missing a good slide deck covering the Hive architecture, usage scenarios, and other interesting details.

The presentation embedded below, from the Facebook Data Infrastructure team, provides all these details and much more (e.g. Hive usage at Facebook, Hadoop and Hive clusters).

09:00 What I learned about deadlines... (3962 Bytes) » The Tom Kyte Blog
I learned that I am not the only one :) Seth's blog is one of the ones I read every time. They are short, to the point and almost always meaningful to me. Deadlines are the greatest motivator for me - if I do not have a deadline for something, I can pretty much guarantee you I will not finish it. I set my own little deadlines for things just to get finished. Whenever someone asks me to do something for them - write a foreword, make a recommendation, whatever - I typically say "sure and what is the drop dead date". If they know me, they'll give me a date before the true 'drop dead' just to have it in a timely fashion (because the odds they see it before then are slim to none).

Speaking of deadlines, I just finished the 2nd edition of Expert Oracle Database Architecture. Right now, this minute. Just have to dot I's and cross T's now - a few final copy edits and it'll be done. This will be the blurb on the back of the book (which you can expect to see soon)

Expert Oracle Database Architecture

Dear Reader,
I have a simple philosophy when it comes to the Oracle database: you can treat it as a black box and just stick data into it, or you can understand how it works and exploit it fully. If you choose the former, you will, at best, waste money and miss the potential of your IT environment. At worst, you will create nonscalable and incorrectly implemented applications—ones that damage your data integrity and, in short, give incorrect information. If you choose to understand exactly how the Oracle database platform should be used, then you will find that there are few information management problems that you cannot solve quickly and elegantly.

Expert Oracle Database Architecture is a book that explores and defines the Oracle database. In this book I’ve selected what I consider to be the most important Oracle architecture features, and I teach them in a proof-by-example manner, explaining not only what each feature is, but also how it works, how to implement software using it, and the common pitfalls associated with it. In this second edition, I’ve added new material reflecting the way that Oracle Database 11g Release 2 works, updated stories about implementation pitfalls, and new capabilities in the current release of the database. The number of changes between the first and second editions of this book might surprise you. Many times, as I was updating the material, I myself was surprised to discover changes in the way Oracle worked that I was not yet aware of. In addition to updating the material to reflect the manner in which Oracle Database 11g Release 2 works, I’ve added an entirely new chapter on data encryption. Oracle Database 10g Release 2 added a key new capability, transparent column encryption, and Oracle Database 11g Release 1 introduced transparent tablespace encryption. This new chapter takes a look at the implementation details of these two key features as well as manual approaches to data encryption.

This book is a reflection of what I do every day. The material within covers topics and questions that I see people continually struggling with, and I cover these issues from a perspective of "When I use this, I do it this way." This book is the culmination of many years’ experience using the Oracle database, in myriad situations. Ultimately, its goal is to help DBAs and developers work together to build correct, high-performance, and scalable Oracle applications.

Thanks and enjoy!

06:19 Tutorial: Getting Started with MongoDB and PHP (1888 Bytes) » myNoSQL
Tutorial: Getting Started with MongoDB and PHP:

Not that we are short on MongoDB and PHP tutorials, but PHP programmers seem to have fun with MongoDB:

In this article, I’ll introduce you to MongoDB, one of the new generation of schema-less database systems that is quickly gaining the attention of open source developers. Over the next few pages, I’ll guide you through the process of getting started with MongoDB, showing you how to install it, set up a data store, connect to it and read and write data using PHP. Let’s get started!

[…]

As these examples illustrate, MongoDB provides a solid, feature-rich implementation of a schema-less database system. Availability for different platforms, easy integration with PHP and other languages, and extensive documentation (plus a very cool interactive online shell for experimentation) make it ideal for developers looking for a modern, document-oriented database. Try it out sometime, and see what you think!

06:01 A Headless Web Site Screenshot Service using Redis and Resque (1233 Bytes) » myNoSQL
A Headless Web Site Screenshot Service using Redis and Resque:

Just another example of using Redis-based queues to build a headless web site screenshot service.

#add a sample
#ruby sample.rb
QUEUE=* rake resque:work
resque-web #optional

#run the webserver
ruby server.rb

wget "http://localhost:4567/schedule?url=http://www.skroutz.gr&callback=http://www.mysite.com/handle_screenshot"
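
The url and callback values in the request above are themselves URLs, so an unquoted & or an unencoded value can easily be misread by the shell or by the server. As a minimal sketch, assuming the service decodes query parameters in the usual way, the same request URL can be built with both values percent-encoded. The helper name schedule_url is hypothetical; the host, port, and parameter names are taken from the example above.

```python
from urllib.parse import urlencode

def schedule_url(target, callback, base="http://localhost:4567/schedule"):
    """Build the schedule request URL with both parameters percent-encoded."""
    query = urlencode({"url": target, "callback": callback})
    return base + "?" + query

if __name__ == "__main__":
    # Same site and callback as in the wget example above.
    print(schedule_url("http://www.skroutz.gr",
                       "http://www.mysite.com/handle_screenshot"))
```

The resulting string can be passed to wget in quotes, which keeps the & from being interpreted by the shell.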
01:11 RHCS Fencing device instruction (1689 Bytes) » DBA@SKY-MOBI
Cluster fencing: fencing is a key component of RHCS. Its purpose is to stop a failed node from modifying shared data, and so to prevent the shared data from splitting.

Power fencing: the power fencing devices must be reachable from every node in the cluster, usually over Ethernet. They come either as add-in cards or integrated on the motherboard.

Power fencing systems supported by Red Hat:

Manufacturer      Model
Bull              Fame (PAP) Management Console
Dell              DRAC 3
Dell              DRAC 4
Dell              DRAC 5
Dell              DRAC/MC
Fujitsu-Siemens   RSB
HP                ILO
HP                ILO 2
IBM               Blade Center
IBM               RSA II
Intel             IPMI over LAN

Dedicated power management devices are also supported by Red Hat:

Manufacturer      Model
APC               MasterSwitch AP7902
APC               MasterSwitch AP7930 – AP7998
APC               MasterSwitch AP7900
APC               MasterSwitch AP7901
APC               MasterSwitch AP7911
APC               MasterSwitch AP7920
APC               MasterSwitch AP7921
WTI               IPS-15
WTI               IPS-1600
WTI               IPS-1600-CE
WTI               IPS-400
WTI               IPS-400-CE
WTI               IPS-800
WTI               IPS-800-CE
WTI               NBB-1600
WTI               NBB-1600-CE
WTI               TPS-2

Note: Supported on Red Hat Enterprise Linux 4 and 5

SAN-based fencing: similar in function to power fencing, SAN fencing cuts the failed node's connection to the shared storage. One thing it cannot cover: if a shared IP is in use, SAN fencing alone is not enough.

SAN switches supported by Red Hat:

Manufacturer      Model
Brocade           Silkworm 2400
Brocade           Silkworm 2800
Brocade           Silkworm 3200
Dell              PowerVault 56F
McData            Sphereon 4500
Vixel             9200

Note: Supported on Red Hat Enterprise Linux 4 and 5

Virtual machine fencing: uses the fence_xvm agent to tell the host of the failed virtual machine to fence that guest.

SCSI-3 fencing: tells the storage to cut off the failed node's communication with the LUN. It requires a non-multipath environment.

2010-06-08 Tue

22:44 Using pg_statsinfo monitor PostgreSQL v8.3,v8.4,v9.0 (1154 Bytes) » DBA@SKY-MOBI
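The pg_statsinfo architecture consists of three components:

1. pg_statsinfo: deployed on the monitored database side. It captures point-in-time snapshots of the database state and filters the database's CSV logs. It needs to communicate with the repository DB.
2. pg_reporter: deployed on the HTML report server. It needs to communicate with the repository DB and, optionally, with the monitored database.
3. repository DB: stores the snapshot reports sent by pg_statsinfo. pg_reporter queries it to generate the HTML reports.

Another very strong point is that you can write your own templates.

[Figure: architecture diagram]

Reports come in two types. The first type is pg_statsinfo, which requires pg_statsinfo support to be installed in the repository database. The second type is schema, which requires a configured connection to the monitored database.

Here is a walkthrough of a statsinfo report:

1. Summary

name                 5480307906522906617
hostname             db-172-16-3-33.sky-mobi.com.hz
port                 1921
pg_version           9.0beta2
snapshot begin       2010-06-08 18:04:52
snapshot end         2010-06-09 13:30:00
snapshot duration    19:25:09
total database size  5073 kB
total commits        18698
total rollbacks      2

2. Database Statistics

ID  database  MB  +MB  commit/s  rollback/s  hit%    gets/s  reads/s  rows/s
1   postgres  4   0    0.267     0.000       99.900  17.772  0.016    95.099
2   test      26  26   0.047     0.000       99.800  23.219  0.043    [...]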
20:37 Wishing alan, lori, and doris success in the OCM exam (447 Bytes) » OracleDBA Blog---三少个人涂鸦地!

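Today I heard from lori that he has also signed up for the OCM exam.
There used to be four of us on the zmcc team: me, alan, lori, and doris. I passed the OCM exam last November and then, for various reasons, chose to leave that team this March.
If the three of them pass, all four original members of the team will be OCMs, which would be something for me to be proud of.
So here is wishing that all of them pass on the first attempt.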

19:56 Table locks in SHOW INNODB STATUS (5361 Bytes) » MySQL Performance Blog

Quite frequently I see people confused about what the table locks reported by SHOW INNODB STATUS really mean. Check this out, for example:

---TRANSACTION 0 4872, ACTIVE 32 sec, process no 7142, OS thread id 1141287232
2 LOCK struct(s), heap size 368
MySQL thread id 8, query id 164 localhost root
TABLE LOCK TABLE `test/t1` trx id 0 4872 LOCK mode IX

This output gives the impression that InnoDB has taken a table lock on the test/t1 table, and many people conclude that in some circumstances InnoDB abandons its row-level locking and uses table locks instead. I've seen various theories, ranging from lock escalation to table locks being used in special cases, for example when no indexes are defined on the table. None of this is right.

In fact, InnoDB uses multiple granularity locking, and a lock is always taken on the whole table before individual row locks can be taken.
Such locks are called intention locks, hence the abbreviation IX = Intention eXclusive. Intention locks do not work the same way as table locks: an intention exclusive lock does not prevent other threads from taking intention shared or even intention exclusive locks on the same table.

What does "intention" mean? Just what it says. If InnoDB sets an intention exclusive lock on a table, it plans to lock some of that table's rows in exclusive mode. What are these locks used for? They make it possible to handle operations on the whole table: for example, to drop a table you need to lock it exclusively, and an exclusive table lock conflicts with any intention lock already held on that table.

So do not worry about the intention table locks you may observe in SHOW INNODB STATUS output; they are almost never the cause of your lock waits or deadlocks.
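
To make the compatibility rules concrete, here is a minimal Python sketch of the standard table-level compatibility matrix used in multiple granularity locking (IS, IX, S, X). The function name can_grant and the data layout are illustrative assumptions, not InnoDB internals; the sketch only models which table-level requests conflict.

```python
# Table-level lock modes under multiple granularity locking:
#   IS = intention shared, IX = intention exclusive,
#   S  = shared,           X  = exclusive.
# COMPATIBLE[held][requested] is True when the two can coexist.
COMPATIBLE = {
    "IS": {"IS": True,  "IX": True,  "S": True,  "X": False},
    "IX": {"IS": True,  "IX": True,  "S": False, "X": False},
    "S":  {"IS": True,  "IX": False, "S": True,  "X": False},
    "X":  {"IS": False, "IX": False, "S": False, "X": False},
}

def can_grant(requested, held):
    """Return True if `requested` is compatible with every lock in `held`.

    `held` lists the table-level modes held by other transactions.
    """
    return all(COMPATIBLE[mode][requested] for mode in held)

if __name__ == "__main__":
    # Two writers touching different rows: both hold IX, no conflict.
    print(can_grant("IX", ["IX", "IS"]))  # True
    # A whole-table exclusive lock must wait for the IX holder.
    print(can_grant("X", ["IX"]))         # False
```

In this model an IX entry like the one in the status output above only blocks requests for S or X on the whole table; row-level conflicts are decided separately by the row locks themselves.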


Entry posted by peter | No comment

10:58 Continued Rows (1 Bytes) » Oracle Scratchpad
09:05 Wherever I May Roam (5114 Bytes) » The Pythian Blog
    Roamer, wanderer
    Nomad, vagabond
    Call me what you will

    $ENV{LC_ALL} = "anywhere";
    my $time = localtime;
    say {$anywhere} my $mind;
    local *anywhere = sub { ... };

    Anywhere I roam
    Where I 'git ghclone environment' is $HOME

        # 'grep may_roam($_) => @everywhere',
        #                with apologies to Metallica

Laziness and a severe addiction to yak shaving conspire to constantly make me tweak configurations and hack scripts to make my everyday editing / shell / development experience as holistic as possible. Unfortunately the same laziness, combined with my constant hopping between home and $work computers, severely gets in the way of effectively using those optimizations. Indeed, although I have those nifty toys installed here and there, because they are not uniformly installed everywhere I constantly find myself using the machines’ functional lowest common denominator.

To fix that, I’ve begun to dump all my environment’s custom configurations, plugins, tweaks and hacks on Github. That way, I can import my whole baseline toolbox on any given box with a simple

git clone git://github.com/yanick/environment.git

As an added bonus, it also provides me with a public platform to show off all my little tricks to the world — and a way to potentially let other peeps fork it and customize it to fit their own needs.

However, importing the environment is only half the battle; it also has to be properly installed. On one hand, the installation shouldn’t be manual, as laziness would slip in again and ensure that it would never happen. On the other, I’m too wary of unintentional clobbering to leave everything to an installation script. So I decided to take the middle road and have a set of passive Perl tests verifying whether the various components are applied to the environment. For every tweak that I make, I also write a short test that checks that it is installed at the proper place. Thanks to the goodness of Perl’s test harness, a quick ‘prove t’ is all that is needed to let me know if the current environment is in sync with the baseline:

[yanick@enkidu environment (master)]$ prove t
t/general.t ... 1/?
#   Failed test 'cp bash/mine.bash ~/.bash/mine.bash'
#   at t/general.t line 15.
# +---+---------------------------------------------+---+-----------------------------------------+
# |   |Got                                          |   |Expected                                 |
# | Ln|                                             | Ln|                                         |
# +---+---------------------------------------------+---+-----------------------------------------+
# | 16|source ~/.bash/git-completion.bash           | 16|source ~/.bash/git-completion.bash       |
# | 17|PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '      | 17|PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '  |
# | 18|                                             | 18|                                         |
# * 19|export PATH="$PATH:~/work/git-achievements"  *   |                                         |
# * 20|alias git=git-achievements                   *   |                                         |
# | 21|                                             | 19|                                         |
# * 22|\n                                           *   |                                         |
# | 23|###########################                  | 20|###########################              |
# | 24|# Misc                                       | 21|# Misc                                   |
# | 25|###########################                  | 22|###########################              |
# +---+---------------------------------------------+---+-----------------------------------------+
# | 42|                                             | 39|                                         |
# | 43|complete -C perldoc_complete perldoc         | 40|complete -C perldoc_complete perldoc     |
# | 44|complete -C perldoc_complete pod             | 41|complete -C perldoc_complete pod         |
# |   |                                             * 42|\n                                       *
# |   |                                             * 43|\n                                       *
# |   |                                             * 44|# aliases                                *
# |   |                                             * 45|source ~/.bash/aliases                   *
# +---+---------------------------------------------+---+-----------------------------------------+
[ etc... ]

It’s not a perfect system, and there’s still a lot of polishing that can be done, but I’ve been using it for a few weeks and it has already proven its worth.
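
For readers who want to try the same idea without Perl, here is a minimal Python sketch of one such passive check. It is an illustration only, not the code from the repository: the function names are hypothetical, and it reports differences with difflib instead of the side-by-side table shown above. Like the original tests, it never copies or overwrites anything.

```python
import difflib
from pathlib import Path

def out_of_sync(got, expected):
    """Return unified-diff lines between two file contents.

    An empty list means the installed copy matches the baseline.
    """
    return list(difflib.unified_diff(
        got.splitlines(), expected.splitlines(),
        fromfile="installed", tofile="baseline", lineterm=""))

def check_installed(baseline, installed):
    """Compare a file in the repository with its installed copy.

    A missing installed file is treated as empty, so the whole
    baseline shows up as missing lines.
    """
    got = installed.read_text() if installed.exists() else ""
    return out_of_sync(got, baseline.read_text())
```

With the paths from the failing test above, a call would look like check_installed(Path("bash/mine.bash"), Path.home() / ".bash" / "mine.bash"); any returned lines show what differs, and the decision to copy the file stays with the user.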

2010-06-07 Mon