2015-11-15

physical reads prefetch warmup

Oracle

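This is old news by now, but I'm noting it down.
In Oracle Database (10.1 and later), when the buffer cache is mostly empty, multiblock reads via db file scattered read can occur even when the access path is INDEX UNIQUE/RANGE SCAN.
The intent is presumably to speed up later processing: while accessing one block, Oracle predicts that nearby blocks will also be accessed soon and loads them into the buffer cache at the same time.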

Summary

db file scattered read occurs instead of db file sequential read for access paths such as INDEX UNIQUE/RANGE SCAN
In SQL trace, V$SQLSTATS, DBA_HIST_SQLSTAT, etc., logical reads are fewer than physical reads (blocks the SQL does not use are being physically read)
The following statistics increase in V$SESSTAT, V$SYSSTAT, DBA_HIST_SYSSTAT, etc.
- physical read total multi block requests: multiblock reads are occurring
- physical reads cache prefetch: prefetching is occurring via multiblock reads
- physical reads prefetch warmup: prefetching via multiblock reads is occurring to warm up the buffer cache
This behavior can be controlled with _db_cache_pre_warm
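As a rough illustration of the third point, the check can be sketched as a comparison of two snapshots of the statistics named above. The function names and the snapshot numbers are hypothetical; in practice the values would come from a view such as V$SYSSTAT, which this sketch does not query.

```python
# Statistic names taken from the list above.
WARMUP_STATS = (
    "physical read total multi block requests",
    "physical reads cache prefetch",
    "physical reads prefetch warmup",
)


def stat_deltas(before, after):
    """Return how much each statistic of interest grew between two snapshots."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in WARMUP_STATS}


def warmup_prefetch_seen(before, after):
    """True if the warmup prefetch counter increased between the snapshots."""
    return stat_deltas(before, after)["physical reads prefetch warmup"] > 0


# Hypothetical snapshot values, for illustration only.
before = {
    "physical read total multi block requests": 100,
    "physical reads cache prefetch": 400,
    "physical reads prefetch warmup": 0,
}
after = {
    "physical read total multi block requests": 250,
    "physical reads cache prefetch": 1600,
    "physical reads prefetch warmup": 1200,
}

if __name__ == "__main__":
    for name, delta in stat_deltas(before, after).items():
        print(f"{name}: +{delta}")
    print("warmup prefetch seen:", warmup_prefetch_seen(before, after))
```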

References

2015-11-15

With disk_asynch_io=false, I/O to ASM becomes synchronous (pwrite)

Oracle Linux

As you know ASM is doing non (operating system) buffered I/O (also known as ‘DIO’ or Direct I/O) regardless of the oracle database filesystemio_options parameter.
But what’s about : Asynchronous/Synchronous I/O ?
If you have a look to MOS note [ID 751463.1] you’ll see that ASM asynchronous/synchronous I/O is entirely controlled by the DISK_ASYNCH_IO parameter and not the FILESYSTEMIO_OPTIONS one.
At the time being, this note only deals with 10.2 databases, so I want to check if this is still the case with 11.2 databases (Let me tell you than I hope so ;-) ) :

...
Conclusion :
With 11.2 databases, ASM asynchronous/synchronous I/O is still entirely controlled by the DISK_ASYNCH_IO parameter
ASM Asynchronous or Synchronous I/O – bdt's blog

@wrcsus4 Same in 12c.
— Hiroshi Sekiguchi  (@discus_hamburg) November 24, 2015

To summarize, for Oracle Database (10.2-12.1) on Linux with the database files stored in ASM, the post says that:

filesystemio_options has no bearing on whether I/O is asynchronous (io_submit and io_getevents) or synchronous (pwrite)
with disk_asynch_io=true (the default), I/O is asynchronous (io_submit and io_getevents); with false, it is synchronous (pwrite)

Synchronous versus asynchronous here is not about whether O_SYNC is used. Oracle Database always performs I/O with O_SYNC so that I/O is not lost.
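One way to see which mode a process is using is to count the system calls named above in a system call trace. The sketch below assumes strace-style lines of the form `name(args) = result`, optionally prefixed by a PID; the sample lines are made up for illustration, and `pwrite64` is included on the assumption that the trace may report pwrite under that name.

```python
import re

ASYNC_CALLS = {"io_submit", "io_getevents"}
SYNC_CALLS = {"pwrite", "pwrite64"}

# Optional leading PID, then the syscall name up to the opening parenthesis.
_SYSCALL = re.compile(r"^(?:\d+\s+)?(\w+)\(")


def classify_io(trace_lines):
    """Count asynchronous and synchronous I/O syscalls in strace-style lines."""
    counts = {"async": 0, "sync": 0}
    for line in trace_lines:
        match = _SYSCALL.match(line.strip())
        if match is None:
            continue  # not a syscall line (e.g. a signal or exit notice)
        name = match.group(1)
        if name in ASYNC_CALLS:
            counts["async"] += 1
        elif name in SYNC_CALLS:
            counts["sync"] += 1
    return counts


# Hypothetical trace excerpt, not captured from a real system.
sample_trace = [
    "io_submit(140737, 1, [...]) = 1",
    "io_getevents(140737, 1, 128, [...], NULL) = 1",
    '12345 pwrite64(256, "...", 8192, 1048576) = 8192',
    "+++ exited with 0 +++",
]

if __name__ == "__main__":
    print(classify_io(sample_trace))
```

A trace dominated by the "async" count would be consistent with disk_asynch_io=true, and one dominated by "sync" with disk_asynch_io=false.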

2015-11-15

vm.swappiness=0 and the OOM Killer on RHEL 6.4 (kernel 2.6.32-303) and later

Linux

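I keep seeing advice that on RHEL 6.4 (kernel 2.6.32-303) and later, setting vm.swappiness=0 makes the OOM Killer more likely to fire, so it should be set to 1 instead. Noting it here; I plan to look into the details later.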

Deploying Oracle Database 12c on Red Hat Enterprise Linux 6 Best Practices

Warning: Since Red Hat Enterprise Linux 6.4, setting swappiness to 0 will even more aggressively avoid swapping out, which increases the risk of out-of-memory (OOM) killing under strong memory and I/O pressure. To achieve the same behavior of swappiness as previous versions of Red Hat Enterprise Linux 6.4 in which the recommendation was to set swappiness to 0, set swappiness to the value of 1. The recommendation of swappiness for Red Hat Enterprise Linux 6.4 or higher running Oracle databases is now the value of 1.

This obviously changed the way we think about “vm.swappiness=0”. Previously, setting this to 0 was thought to reduce the tendency to swap userland processes but not disable that completely. As such it was expected to see little swapping instead of OOM.
This applies to all RHEL/CentOS kernels > 2.6.32-303 and to other distributions that provide newer kernels such as Debian and Ubuntu. Or any other distribution where this change has been backported as in RHEL.
https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/

https://access.redhat.com/ja/node/16476
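The rule of thumb in this entry can be sketched as a version comparison. This is a minimal sketch that assumes a RHEL-style release string and treats 2.6.32-303 itself as already affected; the function names are hypothetical, and it does not know whether a given distribution kernel carries the backported change.

```python
import re

# Kernel release from which vm.swappiness=0 is said to raise the OOM risk.
THRESHOLD = (2, 6, 32, 303)


def parse_kernel_release(release):
    """Turn a string such as '2.6.32-358.el6.x86_64' into (2, 6, 32, 358)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing build number (no '-NNN' part) is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def recommended_swappiness(release):
    """Return 1 for kernels at or past the threshold, otherwise 0."""
    return 1 if parse_kernel_release(release) >= THRESHOLD else 0


if __name__ == "__main__":
    for rel in ("2.6.32-279.el6.x86_64", "2.6.32-358.el6.x86_64"):
        print(rel, "->", recommended_swappiness(rel))
```

On a running system, `platform.release()` from the standard library returns the release string that could be passed to `recommended_swappiness`.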

2015-11-15

Linux kernel parameters related to page reclaim

Linux

A speculative memo: for a Linux (kernel 2.6.32-303 or later) DB server with 16 GB of memory and 16 GB of swap, something roughly like this seems about right.

vm.swappiness=1
vm.overcommit_memory=2
vm.overcommit_ratio=80
vm.min_free_kbytes=524288
vm.extra_free_kbytes=1048576 (kernel 3.5 and later)

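vm.swappiness=1 makes the kernel prefer freeing page cache over paging out. From kernel 2.6.32-303 onward, 0 reportedly makes the OOM Killer more likely to fire, so use 1.
vm.overcommit_memory=2 disables overcommit, which makes the OOM Killer less likely to fire.
vm.overcommit_ratio=80 sets the virtual memory allocation limit to swap size + memory size × 80%.
Set vm.min_free_kbytes larger than the default (which is derived dynamically) so that pages are reclaimed with headroom to spare.
- With 512 MB, kswapd starts reclaiming pages when free memory drops below 640 MB (low pages) and stops once it exceeds 768 MB (high pages). When free memory drops below 512 MB, page reclaim runs synchronously when a process requests memory (direct reclaim).
On kernel 3.5 and later, set vm.extra_free_kbytes to add 1 GB to low pages and high pages, making direct reclaim less likely.
If the memory-usage monitoring threshold (alert) is 90%, setting low pages to about 10% of memory plus a little seems like a good idea.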
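The watermark figures above can be reproduced with a little arithmetic. This is a simplified, system-wide sketch that assumes low = min + min/4 and high = min + min/2, which matches the 640 MB and 768 MB figures in this memo; the real kernel computes watermarks per zone, and the function name is hypothetical.

```python
def watermarks_kb(min_free_kbytes, extra_free_kbytes=0):
    """Return (min, low, high) watermarks in kB under the simplified model."""
    wmark_min = min_free_kbytes
    # extra_free_kbytes raises low and high but leaves min unchanged.
    wmark_low = min_free_kbytes + min_free_kbytes // 4 + extra_free_kbytes
    wmark_high = min_free_kbytes + min_free_kbytes // 2 + extra_free_kbytes
    return wmark_min, wmark_low, wmark_high


if __name__ == "__main__":
    for extra in (0, 1048576):
        wmin, low, high = watermarks_kb(524288, extra)
        print(f"extra={extra // 1024} MB: min={wmin // 1024} MB, "
              f"low={low // 1024} MB, high={high // 1024} MB")
```

With the values from this memo, the 1 GB of extra_free_kbytes widens the gap between the point where kswapd starts (low) and the point where direct reclaim starts (min) from 128 MB to 1152 MB.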

So it ended up with a Solaris flavor.
The 80%, 512 MB, and 1 GB figures are fairly arbitrary.
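Since the 80% figure is arbitrary, it may help to see how the allocation limit moves with vm.overcommit_ratio. This is a minimal sketch assuming the limit under vm.overcommit_memory=2 is swap + memory × ratio / 100 and ignoring huge pages; the function name is hypothetical.

```python
GIB_IN_KB = 1024 * 1024


def commit_limit_kb(mem_kb, swap_kb, overcommit_ratio):
    """Approximate allocation limit in kB with vm.overcommit_memory=2."""
    return swap_kb + mem_kb * overcommit_ratio // 100


if __name__ == "__main__":
    mem_kb = swap_kb = 16 * GIB_IN_KB  # the 16 GB + 16 GB server in this memo
    for ratio in (50, 80, 100):
        limit = commit_limit_kb(mem_kb, swap_kb, ratio)
        print(f"overcommit_ratio={ratio}: {limit / GIB_IN_KB:.1f} GiB")
```

On a live system, the CommitLimit line in /proc/meminfo shows the value the kernel actually uses.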

When it comes to page reclaim, zone_reclaim_mode is also worth a look.

References

Systems Performance: Enterprise and the Cloud

Author: Brendan Gregg
Publisher: Prentice Hall
Release date: 2013/10/26
Format: Paperback

ablog

Notes of a clumsy, restless engineer
