Reading jemalloc (Part 1)

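In the end I could not work up much enthusiasm for getting jemalloc to run on Cygwin, so I decided to read the jemalloc source instead. It looks like there is a lot to learn from it. I will start by translating the opening comment. English is not my strong point, so there are probably plenty of mistranslations. I would be grateful if you pointed them out.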

/* -*- Mode: C; tab-width: 4; c-basic-offset: 4 -*- */
/*-
 * Copyright (C) 2006-2008 Jason Evans <jasone@FreeBSD.org>.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice(s), this list of conditions and the following disclaimer as
 *    the first lines of this file unmodified other than the possible
 *    addition of one or more copyright notices.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice(s), this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
 * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
 * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 *******************************************************************************

This is the license. It seems to say that the notice must be reproduced when the code is redistributed, so I am including it here.

 *
 * This allocator implementation is designed to provide scalable performance
 * for multi-threaded programs on multi-processor systems.  The following
 * features are included for this purpose:

This allocator implementation is designed so that performance scales for multi-threaded programs running on multi-processor systems. To that end it has the following features.

 *
 *   + Multiple arenas are used if there are multiple CPUs, which reduces lock
 *     contention and cache sloshing.
 *

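+ When there are multiple CPUs, multiple arenas are used (an arena is a region from which the memory handed out by malloc is carved). This reduces lock contention and cache sloshing, that is, cache lines bouncing back and forth between the caches of different CPUs.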

 *   + Cache line sharing between arenas is avoided for internal data
 *     structures.
 *

+ Internal data structures are laid out so that arenas do not share cache lines.

 *   + Memory is managed in chunks and runs (chunks can be split into runs),
 *     rather than as individual pages.  This provides a constant-time
 *     mechanism for associating allocations with particular arenas.
 *

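+ Memory is managed in chunks and runs rather than as individual pages. A chunk can be split into runs. This scheme provides a constant-time mechanism for associating an allocation with the arena it belongs to.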
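The constant-time lookup in the last point is easier to see in code. Below is a minimal sketch of one way it can work, assuming that chunks are 1 MB, that every chunk is aligned to its own size, and that each chunk begins with a header pointing back to its arena. The names arena_t, chunk_header_t, chunk_create, chunk_of and arena_of are my own for illustration and are not taken from jemalloc. The sketch also assumes a C11 library that provides aligned_alloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed for this sketch: 1 MB chunks, aligned to 1 MB. */
#define CHUNK_SIZE ((size_t)1 << 20)
#define CHUNK_MASK (CHUNK_SIZE - 1)

/* Hypothetical arena; only an id so that it can be told apart. */
typedef struct arena {
    int id;
} arena_t;

/* Hypothetical header stored at the very start of every chunk. */
typedef struct chunk_header {
    arena_t *arena; /* the arena that owns this chunk */
} chunk_header_t;

/* Allocate one chunk-aligned chunk and record its owning arena. */
static chunk_header_t *chunk_create(arena_t *arena)
{
    chunk_header_t *chunk = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
    if (chunk != NULL)
        chunk->arena = arena;
    return chunk;
}

/* Clear the low bits of any pointer inside a chunk to reach its header. */
static chunk_header_t *chunk_of(const void *ptr)
{
    assert(ptr != NULL);
    return (chunk_header_t *)((uintptr_t)ptr & ~(uintptr_t)CHUNK_MASK);
}

/* One mask and one load, regardless of how many chunks or arenas exist. */
static arena_t *arena_of(const void *ptr)
{
    return chunk_of(ptr)->arena;
}
```

Because the lookup is a single bit mask followed by one pointer load, its cost does not depend on the number of allocations, which is presumably what "constant-time" refers to in the comment.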

 * Allocation requests are rounded up to the nearest size class, and no record
 * of the original request size is maintained.  Allocations are broken into
 * categories according to size class.  Assuming runtime defaults, 4 kB pages
 * and a 16 byte quantum, the size classes in each category are as follows:
 *

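Allocation requests are rounded up to the nearest size class, and the originally requested size is not recorded anywhere. Allocations are divided into categories according to size class. Assuming the runtime defaults, 4 kB pages and a 16-byte quantum (the allocation granularity), the size classes in each category are as follows.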

 *   |=====================================|
 *   | Category | Subcategory    |    Size |
 *   |=====================================|
 *   | Small    | Tiny           |       2 |
 *   |          |                |       4 |
 *   |          |                |       8 |
 *   |          |----------------+---------|
 *   |          | Quantum-spaced |      16 |
 *   |          |                |      32 |
 *   |          |                |      48 |
 *   |          |                |     ... |
 *   |          |                |     480 |
 *   |          |                |     496 |
 *   |          |                |     512 |
 *   |          |----------------+---------|
 *   |          | Sub-page       |    1 kB |
 *   |          |                |    2 kB |
 *   |=====================================|
 *   | Large                     |    4 kB |
 *   |                           |    8 kB |
 *   |                           |   12 kB |
 *   |                           |     ... |
 *   |                           | 1012 kB |
 *   |                           | 1016 kB |
 *   |                           | 1020 kB |
 *   |=====================================|
 *   | Huge                      |    1 MB |
 *   |                           |    2 MB |
 *   |                           |    3 MB |
 *   |                           |     ... |
 *   |=====================================|
 *
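To check my reading of the table, here is a small sketch that rounds a request up to its size class. It is derived only from the table above, not from the jemalloc source: the 1 MB chunk size and the 1020 kB upper limit for the Large category are read off the table, the function and macro names are my own, and treating a request of 0 bytes like a request of 1 byte is a simplification I made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed from the table: 16-byte quantum, 4 kB pages, 1 MB chunks. */
#define QUANTUM     ((size_t)16)
#define PAGE_SIZE   ((size_t)4096)
#define CHUNK_SIZE  ((size_t)1 << 20)
#define TINY_MAX    ((size_t)8)              /* largest Tiny class */
#define QUANTUM_MAX ((size_t)512)            /* largest Quantum-spaced class */
#define SUBPAGE_MAX (PAGE_SIZE / 2)          /* largest Sub-page class, 2 kB */
#define LARGE_MAX   (CHUNK_SIZE - PAGE_SIZE) /* largest Large class, 1020 kB */

/* Smallest power of two that is >= x (x must be at least 1). */
static size_t pow2_ceil(size_t x)
{
    size_t p = 1;
    assert(x >= 1);
    while (p < x)
        p <<= 1;
    return p;
}

/* Round x up to the next multiple of align. */
static size_t round_up(size_t x, size_t align)
{
    assert(align != 0);
    return (x + align - 1) / align * align;
}

/* Map a request size to the size class listed in the table. */
static size_t size_class(size_t size)
{
    if (size == 0)
        size = 1; /* simplification made for this sketch */
    if (size <= TINY_MAX) {
        size_t c = pow2_ceil(size);
        return c < 2 ? 2 : c;            /* Tiny: 2, 4, 8 */
    }
    if (size <= QUANTUM_MAX)
        return round_up(size, QUANTUM);  /* Quantum-spaced: 16 .. 512 */
    if (size <= SUBPAGE_MAX)
        return pow2_ceil(size);          /* Sub-page: 1 kB, 2 kB */
    if (size <= LARGE_MAX)
        return round_up(size, PAGE_SIZE);/* Large: 4 kB .. 1020 kB */
    return round_up(size, CHUNK_SIZE);   /* Huge: multiples of 1 MB */
}
```

For example, under these assumptions a 500-byte request lands in the 512-byte class, a 513-byte request jumps to 1 kB, and anything just over 2 kB already costs a full 4 kB page.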


 * A different mechanism is used for each category:
 *

Each category uses a different mechanism.

 *   Small : Each size class is segregated into its own set of runs.  Each run
 *           maintains a bitmap of which regions are free/allocated.
 *

Small: each size class is segregated into its own set of runs. Each run keeps a bitmap that records which of its regions are free and which are allocated.
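As a way of picturing the bitmap, here is a minimal sketch of a run that holds 64 equally sized regions and tracks them in a single 64-bit mask. The type run_t, the functions run_init, run_alloc and run_free, the fixed count of 64 regions, and the convention that a set bit means "free" are all choices made for this sketch. The real layout in jemalloc may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RUN_REGIONS 64u /* assumed for this sketch: 64 regions per run */

/* Hypothetical run: a block of memory cut into equal regions. */
typedef struct run {
    unsigned char *base;   /* start of the run's memory */
    size_t region_size;    /* size class served by this run */
    uint64_t free_mask;    /* bit i set means region i is free */
} run_t;

/* Prepare a run over caller-provided memory; every region starts free. */
static void run_init(run_t *run, void *base, size_t region_size)
{
    run->base = base;
    run->region_size = region_size;
    run->free_mask = UINT64_MAX;
}

/* Hand out the lowest-numbered free region, or NULL if the run is full. */
static void *run_alloc(run_t *run)
{
    for (unsigned i = 0; i < RUN_REGIONS; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if (run->free_mask & bit) {
            run->free_mask &= ~bit; /* mark region i allocated */
            return run->base + (size_t)i * run->region_size;
        }
    }
    return NULL;
}

/* Return a region to the run by setting its bit again. */
static void run_free(run_t *run, void *ptr)
{
    size_t i = (size_t)((unsigned char *)ptr - run->base) / run->region_size;
    assert(i < RUN_REGIONS);
    assert((run->free_mask & ((uint64_t)1 << i)) == 0); /* must be allocated */
    run->free_mask |= (uint64_t)1 << i;
}
```

Since every region in a run has the same size, one bit per region is all the bookkeeping needed, which fits with the earlier remark that the original request size is not stored.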

 *   Large : Each allocation is backed by a dedicated run.  Metadata are stored
 *           in the associated arena chunk header maps.
 *

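Large: each allocation is backed by a dedicated run. Metadata are stored in the header maps of the associated arena chunk.

I was not sure how to render the second sentence. My reading is that the header of the arena chunk containing the run holds a map, and the metadata for the allocation is kept there.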
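To make the Large case concrete for myself, here is a sketch of what such a header map could look like: one entry per page of the chunk, where the entry for the first page of a large run records how many bytes that run covers. This is purely my guess at the idea. The type chunk_t, its fields, and the functions large_record and large_size are hypothetical, and the sketch keeps the map in a separate struct instead of in the first page of the chunk itself.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for this sketch: 4 kB pages and 1 MB chunks, so 256 pages. */
#define PAGE_SIZE   ((size_t)4096)
#define CHUNK_PAGES ((size_t)256)

/* Hypothetical chunk header with a per-page map. */
typedef struct chunk {
    unsigned char *base;           /* start of the chunk's pages */
    size_t run_bytes[CHUNK_PAGES]; /* size of the large run that starts at
                                      each page, 0 if none starts there */
} chunk_t;

/* Record that a large run of npages pages starts at page first_page. */
static void large_record(chunk_t *chunk, size_t first_page, size_t npages)
{
    assert(npages > 0 && first_page + npages <= CHUNK_PAGES);
    chunk->run_bytes[first_page] = npages * PAGE_SIZE;
}

/* Look up the size of the large allocation that begins at ptr. */
static size_t large_size(const chunk_t *chunk, const void *ptr)
{
    size_t offset = (size_t)((const unsigned char *)ptr - chunk->base);
    assert(offset % PAGE_SIZE == 0 && offset / PAGE_SIZE < CHUNK_PAGES);
    return chunk->run_bytes[offset / PAGE_SIZE];
}
```

The point of the sketch is only that the size lives in the chunk header and is found from the page index of the pointer, so nothing needs to be stored next to the allocation itself.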

 *   Huge : Each allocation is backed by a dedicated contiguous set of chunks.
 *          Metadata are stored in a separate red-black tree.
 *
 *******************************************************************************

Huge: each allocation is backed by a dedicated contiguous set of chunks. Metadata are stored in a separate red-black tree.
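For the Huge case, the idea of keeping metadata in a tree outside the allocation can be sketched as follows. Note that this is a plain, unbalanced binary search tree keyed by address, standing in for the red-black tree mentioned in the comment: the rebalancing that makes a red-black tree worthwhile is left out to keep the example short. The names huge_node_t, huge_insert and huge_find are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record for one huge allocation. */
typedef struct huge_node {
    void *addr;               /* start address of the allocation */
    size_t size;              /* size in bytes */
    struct huge_node *left;   /* nodes with lower addresses */
    struct huge_node *right;  /* nodes with higher addresses */
} huge_node_t;

/* Insert a node, ordered by address. No rebalancing is done here. */
static void huge_insert(huge_node_t **root, huge_node_t *node)
{
    assert(root != NULL && node != NULL);
    node->left = NULL;
    node->right = NULL;
    while (*root != NULL) {
        if ((uintptr_t)node->addr < (uintptr_t)(*root)->addr)
            root = &(*root)->left;
        else
            root = &(*root)->right;
    }
    *root = node;
}

/* Find the record for the allocation starting at addr, or NULL. */
static huge_node_t *huge_find(huge_node_t *root, const void *addr)
{
    while (root != NULL && root->addr != addr) {
        if ((uintptr_t)addr < (uintptr_t)root->addr)
            root = root->left;
        else
            root = root->right;
    }
    return root;
}
```

With a real red-black tree the lookup stays logarithmic in the number of huge allocations even in the worst case, which the unbalanced version above does not guarantee.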