123
 123

2008-04-22 Tue

21:22 Oracle的分区表和Local索引创建与维护 (13860 Bytes) » Oracle Life

©作者:eygle 发布在 eygle.com

Oracle的分区技术在某些条件下可以极大的提高查询的性能,所以被广泛采用。
从产品上说,分区技术是Oracle企业版中独立收费的一个组件。

以下是对于分区及本地索引的一个示例。

首先根据字典表创建一个测试分区表:
SQL> connect eygle/eygle
Connected.
SQL> CREATE TABLE dbobjs
  2  (OBJECT_ID        NUMBER NOT NULL,
  3  OBJECT_NAME      varchar2(128),
  4  CREATED          DATE  NOT NULL
  5  )
  6  PARTITION BY RANGE (CREATED)
  7  (PARTITION dbobjs_06 VALUES LESS THAN (TO_DATE('01/01/2007', 'DD/MM/YYYY')),
  8  PARTITION dbobjs_07 VALUES LESS THAN (TO_DATE('01/01/2008', 'DD/MM/YYYY')));

Table created.

SQL> COL segment_name for a20
SQL> COL PARTITION_NAME for a20
SQL> SELECT segment_name, partition_name, tablespace_name
  2    FROM dba_segments
  3  WHERE segment_name = 'DBOBJS';

SEGMENT_NAME        PARTITION_NAME      TABLESPACE_NAME
-------------------- -------------------- ------------------------------
DBOBJS              DBOBJS_06            EYGLE
DBOBJS              DBOBJS_07            EYGLE

创建一个Local索引,注意这里可以将不同分区的索引指定创建到不同的表空间:
SQL> CREATE INDEX dbobjs_idx ON dbobjs (created) LOCAL
  2  (PARTITION dbobjs_06 TABLESPACE users,
  3    PARTITION dbobjs_07 TABLESPACE users
  4    );

Index created.
这个子句可以进一步调整为类似:
CREATE INDEX dbobjs_idx ON dbobjs (created) LOCAL
(PARTITION dbobjs_06 TABLESPACE users,
  PARTITION dbobjs_07 TABLESPACE users
  ) TABLESPACE users;
通过统一的tablespace子句为索引指定表空间。 

SQL> COL segment_name for a20
SQL> COL PARTITION_NAME for a20
SQL> SELECT segment_name, partition_name, tablespace_name
  2    FROM dba_segments
  3  WHERE segment_name = 'DBOBJS_IDX';

SEGMENT_NAME        PARTITION_NAME      TABLESPACE_NAME
-------------------- -------------------- ------------------------------
DBOBJS_IDX          DBOBJS_06            USERS
DBOBJS_IDX          DBOBJS_07            USERS


SQL> insert into dbobjs
  2 select object_id,object_name,created
  3 from dba_objects where created <to_date('01/01/2008','dd/mm/yyyy') and object_id is not null; 

6227 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from dbobjs partition (DBOBJS_06);

  COUNT(*)
----------
      6154

SQL> select count(*) from dbobjs partition (dbobjs_07);

  COUNT(*)
----------
        73

我们可以通过查询来对比一下分区表和非分区表的查询性能差异:
SQL> set autotrace on
SQL> select count(*) from dbobjs where created < to_date('01/01/2008','dd/mm/yyyy');

  COUNT(*)
----------
      6227


Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=9)
  1    0  SORT (AGGREGATE)
  2    1    PARTITION RANGE (ALL)
  3    2      INDEX (RANGE SCAN) OF 'DBOBJS_IDX' (NON-UNIQUE) (Cost=2 Card=8 Bytes=72)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        25  consistent gets
          0  physical reads
          0  redo size
        380  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(*) from dbobjs where created < to_date('01/01/2007','dd/mm/yyyy');

  COUNT(*)
----------
      6154


Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=9)
  1    0  SORT (AGGREGATE)
  2    1    INDEX (RANGE SCAN) OF 'DBOBJS_IDX' (NON-UNIQUE) (Cost=2 Card=4 Bytes=36)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        24  consistent gets
          0  physical reads
          0  redo size
        380  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(distinct(object_name)) from dbobjs where created < to_date('01/01/2007','dd/mm/yyyy');

COUNT(DISTINCT(OBJECT_NAME))
----------------------------
                        4753


Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=75)
  1    0  SORT (GROUP BY)
  2    1    TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'DBOBJS' (Cost=1 Card=4 Bytes=300)
  3    2      INDEX (RANGE SCAN) OF 'DBOBJS_IDX' (NON-UNIQUE) (Cost=2 Card=1)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        101  consistent gets
          0  physical reads
          0  redo size
        400  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

对于非分区表的测试:
SQL> CREATE TABLE dbobjs2
  2  (object_id    NUMBER NOT NULL,
  3  object_name  VARCHAR2(128),
  4  created      DATE  NOT NULL
  5  );

Table created.

SQL> CREATE INDEX dbobjs_idx2 ON dbobjs2 (created);

Index created.

SQL> insert into dbobjs2
  2  select object_id,object_name,created
  3  from dba_objects where created <to_date('01/01/2008','dd/mm/yyyy') and object_id is not null;

6227 rows created.

SQL> commit;

Commit complete.

SQL> select count(distinct(object_name)) from dbobjs2 where created < to_date('01/01/2007','dd/mm/yyyy');

COUNT(DISTINCT(OBJECT_NAME))
----------------------------
                        4753


Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE
  1    0  SORT (GROUP BY)
  2    1    TABLE ACCESS (BY INDEX ROWID) OF 'DBOBJS2'
  3    2      INDEX (RANGE SCAN) OF 'DBOBJS_IDX2' (NON-UNIQUE)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      2670  consistent gets
          0  physical reads
      1332  redo size
        400  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

当增加表分区时,LOCAL索引被自动维护:
SQL> ALTER TABLE dbobjs
  2  ADD PARTITION dbobjs_08 VALUES LESS THAN (TO_DATE('01/01/2009', 'DD/MM/YYYY'));

Table altered.

SQL> set autotrace off
SQL> COL segment_name for a20
SQL> COL PARTITION_NAME for a20
SQL> SELECT segment_name, partition_name, tablespace_name
  2    FROM dba_segments
  3  WHERE segment_name = 'DBOBJS_IDX';

SEGMENT_NAME        PARTITION_NAME      TABLESPACE_NAME
-------------------- -------------------- ------------------------------
DBOBJS_IDX          DBOBJS_06            USERS
DBOBJS_IDX          DBOBJS_07            USERS
DBOBJS_IDX          DBOBJS_08            EYGLE

SQL> SELECT segment_name, partition_name, tablespace_name
  2    FROM dba_segments
  3  WHERE segment_name = 'DBOBJS';

SEGMENT_NAME        PARTITION_NAME      TABLESPACE_NAME
-------------------- -------------------- ------------------------------
DBOBJS              DBOBJS_06            EYGLE
DBOBJS              DBOBJS_07            EYGLE
DBOBJS              DBOBJS_08            EYGLE

-The End-

相关文章|Related Articles

评论数量(0)|Add Comments

本文网址:

18:41 How to estimate query completion time in MySQL (5597 Bytes) » MySQL Performance Blog

Have you ever run a query in MySQL and wondered how long it'll take to complete? Many people have had this experience. It's not a big deal until the query has been running for an hour. Or a day and a half. Just when IS that query going to finish, anyway?

There are actually a few ways to estimate how long it'll take for the query to complete, depending on what the query is. One of the simplest is to estimate how many rows the query needs to examine, measure how fast it's working, and do the math.

As an example, I recently worked on a customer's site where a typical data-warehousing query needed optimization. It was a fact table joined to two dimension tables -- a classic star schema query. The fact table was very large, and after some tuning (I'll write more about that later) I convinced MySQL to perform the query as a table scan of the fact table, then an index lookup in each dimension table in turn.

The table structures aren't really important. All you need to know for this post is that the fact table has about 150 million rows and the query was taking over 10 minutes to complete. Actually, it had never completed at all, according to the customer. I wanted to know whether I'd be waiting for another minute, hours, or days.

The answer was simple, because there was nothing else running on the server. That means that SHOW GLOBAL STATUS gave a rough idea of what the query was actually doing. (If there had been a lot of activity on the server, I wouldn't have been able to say with confidence that SHOW GLOBAL STATUS showed what that one query was doing; activity from other queries would have been mixed in there too. It would be great to be able to choose another thread and watch only its status, but MySQL doesn't currently let you do that.)

The solution was to measure how fast the query was scanning rows in the table scan of the fact table. This is shown by the Handler_read_rnd_next status variable. Here's an easy way to watch it (innotop is another handy way):

CODE:
  1. mysqladmin extended -r -i 10 | grep Handler_read_rnd_next
  2. -- ignore the first line of output...
  3. | Handler_read_rnd_next             | 429224      |

So the server was reading roughly 43K rows per second, and there were 150 million rows in the table. A little math later, and you get 3488 seconds to completion, or a little less than an hour. And indeed the query completed in about 55 minutes.

This is the simplest case, and there are more complicated ones to consider, but hopefully this gives you an idea how you can tackle this problem in different situations.


Entry posted by Baron Schwartz | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks

10:33 Tips and tricks: How do I resize multiple image files using command-line tools? (2710 Bytes) » Red Hat Magazine

Image file size depends on the amount of image details it has. The higher the image quality, the better the resolution but the larger the resulting file size. Web galleries tend to load very slow and sometimes inaccessbile if photos have very high resolutions.

The mogrify command allows users to resize multiple image files at once. mogrify comes with the ImageMagick package. ImageMagick is a program used to resize, rotate, manipulate and display images. It can read, convert and write images in JPEG, TIFF, PNM, GIF and other Photo CD image formats.

Check if the ImageMagick package is installed:

rpm -q ImageMagick

To install the ImageMagick package on Red Hat Enterprise Linux 3 or 4, use the up2date command:

up2date ImageMagick

On Red Hat Enterprise Linux 5, use the yum command:

yum install ImageMagick

After installing ImageMagick package, use mogrify to resize multiple image files. See example command below in resizing all JPG photos in the current working directory to 640 pixels in width while maintaining its original aspect ratio:

mogrify -resize 640 *.jpg

To resize the resolution of all JPG image files to exactly 640 pixels in width and 480 pixels in height, use:

mogrify -resize 640x480! *.jpg
08:39 Varnish让内核来控制那些内容缓存在内存中/那些在磁盘中; ArchitectNotes - Varnish的架构说明 [del.icio.us] (246 Bytes) » 车东[Blog^2]
Varnish allocate some virtual memory, it tells the operating system to back this memory with space from a disk file. When it needs to send the object to a client, it simply refers to that piece of virtual memory and leaves the rest to the kernel.
06:17 [AD] “我不是一个塑料袋” 博客大巴环保袋设计大赛 一等奖为MacBook Air + 5000¥ (692 Bytes) » 车东[Blog^2]
“我不是塑料袋”是一个很有趣的设计创意: I'm not a plastic bag 此次我不是塑料袋:环保袋设计大赛活动的奖品为: 评委会奖(1名):MacBook Air + 5000元现金 ( 歌手Yael Naim的那首《New Soul》你每天在电梯中都会听到多次吧? ) 独树一帜奖(2名):价值5000元以上的尼康D60单反相机1台 + 5000元现金 ...
05:01 山东之雄伟,艳丽,宁静 (1826 Bytes) » OracleBlog.cn

前段时间,在山东出差,顺便逛了几处名胜,深感山东之雄伟,艳丽,宁静……

泰山天街:

泰山南天门:

泰山眺望:

泉城广场:

趵突泉_1:

趵突泉_2:

趵突泉_3:

05:00 eBay 的 Personalization Platform 采用 MySQL (4902 Bytes) » DBA notes

作者:Fenng 发布在 dbanotes.net. FeedBurner 订阅数量,点击则可进行订阅

过去写过很多关于 eBay 数据平台架构的帖子,过去eBay 的信息架构里 DB 都是采用 Oracle 的,大多数 DBA 朋友也都知道 eBay 在 Oracle 方面的技术搞得非常好。这次的 The 2008 MySQL Conference & Expo 披露出来的信息,eBay 在 MySQL 上做了很大胆的尝试,eBay Personalization Platform 就是用 MySQL 打造的。Sun 当然不会放弃这个大好的宣传机会(这两家在技术上的合作一向也比较多),所以年度最佳应用给了 eBay (一同获奖的还有 Virgin Mobile France 和 Facebook )。

面临的应用场景:客户端 Cookie 最大 4K,如果要传递更多定制化信息就不好搞了。作为电子商务站点,肯定有要为用户提供更具有关联性的商品信息的业务需求,这样就要跳出原有的窠臼。通过数据库集群来存储类似的信息就是有必要的,但 eBay 原有 Oracle 数据库上的压力已经很大。

eBay 采用 MySQL Memory Engine 做数据库 Cache 层解决方案(如果纯粹用 Memcached 类似的方案也不太适合的,读写比例接近), eBay 工程师 Igor Chernyshev 对内存引擎做了质的改进,而这些改进是开放出来的(Mysql-heap-dynamic-rows 项目,对 VARCHAR 列的内存开销算法做了革命性的改进)。另外一个 Patch 扩充了并发能力,一台普通的 Sun 4100 上能支撑 20000 个并发连接。每秒钟处理 13000 个 TPS,读写各半。25 台机器组成的集群,每天支撑 40 亿次的读写请求,为每个用户传递的定制数据平均大小 40 K,从 4K 到 40K ,足够多的定制信息可以存储了。

架构示意图(来源):
eBay_mysql_platform.png

这个个性化平台系统虽说关键,但是存储的数据并非不能丢失的。这也是 eBay 大胆采用 MySQL 的一个考虑因素吧。

MySQL 更大规模部署时代似乎来临。

--EOF--

相关文章|Related Articles

评论数量(0)|Add Comments

本文网址:
最近作者还说了什么? Follow Twitter / Fenng

03:38 最近 (7881 Bytes) » blue_prince

杭州的春天周末老是下雨,上个周末再次印证了我的推测。来杭州快2年了,去年开始发现杭州春季时工作日的天气都出奇的好,但一到周末就下雨。好不容易有个休息日,想出去走走都不成。

最近迷上了HP新推出的迷你小本本HP 2133。一直以来都只喜欢小本本,以前SONYSRX系列是我了最爱。不过那时候是穷学生,买不起,现在还想收个成色好的二手收藏来着,不过PIII时代的配置到现在确实是过时了。用过DELLX300,那个本本离我心目中的完美本本已经非常接近了,轻、薄,香槟银镁铝合金外壳,当时由于过于轻薄,后面屏幕被压出了一圈白斑,不得不出掉了。12寸的屏幕我认为还是大了些,10寸左右的正好,这样键盘也不至于太小,要是键盘太小打字就不方便了,为了轻薄牺牲了易用性也是我所不能容忍的。HP2133的配置离我心目中的超轻薄本本的要求已经非常接近了,当然8.9寸的屏幕是小了些,不过上盖多出的空间用于音箱,这样本本的音质就会好些。况且键盘基本上达到95%的标准键盘大小。当然用VIACPU是比较不爽的,不过进入迅驰时代以后我对CPU的要求便越来越低了。不玩游戏,只是用电脑看看碟,上上网啥的,现在主流的硬件配置已经足够了。当然还有很重要的一点便是价格便宜,VIA 1.2G CPU1G内存、120G硬盘的配置卖549美刀,和众多超轻薄本本高不可攀的价格相比,这个价格是够平易近人了。当然外观也是我所注重的,铝合金银色外观是我所喜欢的。嗯,准备到时候整一台,这样外出的时候我就可以只背着这个1.2KG的本本了。

把《24小时》的第一季看完了。继《越狱》之后,最近疯狂地迷恋上了美剧,最疯狂的便是前段时间看《LOST》,一口气看到最新的第四季第8集。看完《LOST》后,以后坐飞机是会担心的,还有对里面时不时出现的一些恐怖场景比较不爽,我胆子还是小的。看美剧我喜欢下载中英文双字幕的版本,这样看的时候尽量看英文字幕,碰到不懂的单词时再看中文字幕。这样对学英语还是有挺大帮助的,明显感觉自己语感好了不少。发现美剧里面的对话都是很简单的,大部分都能看得懂,就是速度太快,有时候来不及只好看中文字幕。以前下载电影时都特地下AVI格式的,然后下载中英文字幕,先看英文,碰到不懂的时候就停下来翻金山词霸,那样看起来怪累人的,最后便都看中文了。现在有些RMVB会有字幕组把中英文字幕都整合起来,看起来就方便多了。我是懒人,这样子学英语对于目前的我来说是个最好的办法。

朋友来杭州工作,周末陪他一起四处找房子。发现现在的房东大部分都是跟自己年龄相仿的80后,想想自己都快租不起房子,人家都在出租房子,差距啊!

ORACLE 10.2.0.4BUG的补丁好不容易推出了,结果我一装上就报错了。昨天折腾了一下午,昨晚上下班都走到门口了,结果又报错了,只好回来再整。后面不得不把这个PATCH回滚,数据库才恢复正常。对ORACLESR已经彻底失望了,不断地跟你来回折腾,耗到你没有心思再跟为止。

03:19 随笔 (582 Bytes) » Think in 88
      这几天的工作比较无聊,帮运营的同事提数据,同情数据仓库的同事,这实在是世界上最烦人的工作~MS工作总是烦人与无趣的~激情不知道哪里去了~
02:12 数据库迁移几种方式简单介绍 (4307 Bytes) » OracleBlog.cn

我们常常需要对数据进行迁移,迁移到更加高级的主机上、迁移到远程的机房上、迁移到不同的平台下……

一、exp/imp:

这也算是最常用最简单的方法了,一般是基于应用的owner级做导出导入。
  
    操作方法为:在新库建立好owner和表空间,停老库的应用,在老库做exp user/pwd owner=XXX file=exp_xxx.dmp log=exp_xxx.log buffer=6000000,传dmp文件到新库,在新库做imp user/pwd fromuser=XXX touser=XXX file=exp_xxx.dmp log=imp_xxx.log ignore=y.
 
         优缺点:优点是可以跨平台使用;缺点是停机时间长,停机时间为从exp到网络传输到新库,再加上imp的时间。

二、存储迁移:

这种情况下,数据文件、控制文件、日志文件、spfile都在存储上(一般情况下是裸设备),我们可以直接把存储挂到新机器上,然后在新机器上启动数据库。
 
      操作方法:将老库的pfile(因为里面有指向裸设备的spfile链接),tnsnames.ora,listener.ora,密码文件传到新库的对应位置。将存储切至新机,或者用文件拷贝或dd的方式复制数据文件,启动数据库。
 
      优缺点:优点是该迁移方式非常简单,主要的工作是主机工程师的工作,dba只需配合即可,停机时间为当库、切存储、起库的时间。缺点是要求新老库都是同一平台,是相同的数据库版本。

三、利用data guard迁移:

用dg我们不仅可以用来做容灾,物理的dg我们还可以作为迁移的方式。
        
     操作方法:可见http://www.oracleblog.cn/study-note/dg-created-by-rman/或者http://www.oracleblog.cn/study-note/create-dg-by-rman-one-datafile-by-one-datafile/或者其他相关网文。注意switch over之后,可以将dg拆掉,去掉log_archive_dest_2、FAL_SERVER、FAL_CLIENT、standby_file_management参数。另外还要注意如果用rman做dg,注意手工添加tempfile。
 
     优缺点:优点是停机时间短,停机时间为switch over的时间。缺点:主机必须双份、存储必须双份。

四、用rman做迁移:

rman比较适合于跨文件系统的迁移,如同平台下的不同文件系统。
 
        操作方法:
1.停第三方的归档备份,如legato或dp
2.backup数据库:
  run {
    allocate channel t1 type disk;
    backup full format '$DIR_BAK/UNDOTBS1_2_%d_%s_%p.bak' datafile 2;
    ……
    release channel t1;
      }
3.备份控制文件alter database backup controlfile to '/tmp/mydb.ctl';并到新数据库用rman恢复:restore controlfile from '/arch/sd168.ctl';
4.restore备份文件:
run {
    allocate channel t1 type disk;
    restore datafile 2;
    releasechannel t1;
    }
5.传归档日志,并且对归档进行做recover:recover database until sequence = 归档的序号 thread = 1;
6.对数据库open resetlogs:RMAN>sql 'alter database open resetlogs';
7.alter tablespace temp add tempfile 'XXXXXX' size XXM reuse;
 
      优缺点:优点是可以跨文件系统,停机时间少。缺点是要时刻关注这归档日志,做recover的时候一个都不能少!

2008-04-21 Mon

19:28 SixApart Acquires Social Media Agency Apperceptive » Fenng's shared items in Google Reader
08:03 mysql 在国外facebook,flickr等公司内的使用情况 » Fenng's shared items in Google Reader
06:35 早睡早起 » eagle's home
05:49 浪潮之巅第八章 没落的贵族—摩托罗拉(六) » Fenng's shared items in Google Reader
01:45 Social Environment in Central China » Fenng's shared items in Google Reader
01:28 提高mysql memory(heap) engine内存性能的开源补丁 » Fenng's shared items in Google Reader
01:00 首届杭州·西湖现代音乐节 » Fenng's shared items in Google Reader
00:53 Resin、Apache、PHP与404错误 » Oracle Life
00:01 系统管理员工具包: 更改 UID 和 GID » developerWorks : AIX 专区的文章,教程

2008-04-20 Sun

23:05 MySQL and plugin binaries » MySQL Performance Blog
22:13 表空间管理 » Alibaba DBA Team
19:46 爸妈的来信 » Chanel [K]
18:30 美丽的京剧 美丽的晋剧 » Oracle Life
17:12 保证早起的六个习惯 » 生活帮-LifeBang
16:31 一次NBU备份错误诊断 » yangtingkun
15:15 两颗安定还放不倒自己 » OracleDBA Blog---请享受无法回避的痛苦!