2011-10-29
■ Reading through conf/schema-sample.txt in Cassandra 1.0.0
/*This file contains an example Keyspace that can be created using the cassandra-cli command line interface as follows. bin/cassandra-cli -host localhost --file conf/schema-sample.txt The cassandra-cli includes online help that explains the statements below. You can accessed the help without connecting to a running cassandra instance by starting the client and typing "help;" */
WARNING: [{}] strategy_options syntax is deprecated, please use {}
Line 3 => No enum const class org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.MEMTABLE_FLUSH_AFTER
Loading schema-sample.txt as shipped produces the warning and error above. The file has two bugs, so fix them as follows:
[{replication_factor:1}] → {replication_factor:1}
and memtable_flush_after = 59 → /* and memtable_flush_after = 59 */
create keyspace Keyspace1
with strategy_options={replication_factor:1}
and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
use Keyspace1;
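A keyspace is something like a namespace. In MySQL terms, it is the database name.
replication_factor is the number of replicas.
placement_strategy selects the algorithm that decides where replicas are placed.
So these are specified per keyspace.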
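To make replication_factor and placement_strategy more concrete, here is a minimal Python sketch of the ring walk that the CLI help describes for SimpleStrategy. It is not Cassandra's code: the function name, the node names, and the integer tokens are all made up for illustration, and it assumes the first replica is the first node whose token is greater than or equal to the key's token (wrapping around the ring).

```python
from bisect import bisect_left


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Return the node names that would hold replicas for key_token.

    ring: list of (token, node_name) pairs, one per node (hypothetical data).
    """
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    # First replica: the first node whose token is >= the key's token,
    # wrapping to the start of the ring when no such node exists.
    start = bisect_left(tokens, key_token) % len(ordered)
    # Additional replicas: the following nodes in increasing token order.
    count = min(replication_factor, len(ordered))
    return [ordered[(start + i) % len(ordered)][1] for i in range(count)]


ring = [(0, "node-a"), (100, "node-b"), (200, "node-c"), (300, "node-d")]
replicas = simple_strategy_replicas(ring, key_token=150, replication_factor=2)
print(replicas)
```

With replication_factor = 1, as in Keyspace1 above, the same call would return a single node.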
% bin/cassandra-cli -host localhost -port 9160
Connected to: "TakeruCluster001" on localhost/9160
Welcome to the Cassandra CLI.
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] help create keyspace;
create keyspace <keyspace>;
create keyspace <keyspace> with <att1>=<value1>;
create keyspace <keyspace> with <att1>=<value1> and <att2>=<value2> ...;
Create a keyspace with the specified attributes.
Required Parameters:
- keyspace: Name of the new keyspace, "system" is reserved for
Cassandra internals. Names may only contain letters, numbers and
underscores.
Keyspace Attributes (all are optional):
- placement_strategy: Class used to determine how replicas
are distributed among nodes. Defaults to NetworkTopologyStrategy with
one datacenter defined with a replication factor of 1 ("[datacenter1:1]").
Supported values are:
- org.apache.Cassandra.locator.SimpleStrategy
- org.apache.Cassandra.locator.NetworkTopologyStrategy
- org.apache.Cassandra.locator.OldNetworkTopologyStrategy
SimpleStrategy merely places the first replica at the node whose
token is closest to the key (as determined by the Partitioner), and
additional replicas on subsequent nodes along the ring in increasing
Token order.
Supports a single strategy option 'replication_factor' that
specifies the replication factor for the cluster.
With NetworkTopologyStrategy, for each datacenter, you can specify
how many replicas you want on a per-keyspace basis. Replicas are
placed on different racks within each DC, if possible.
Supports strategy options which specify the replication factor for
each datacenter. The replication factor for the entire cluster is the
sum of all per datacenter values. Note that the datacenter names
must match those used in conf/cassandra-topology.properties.
OldNetworkToplogyStrategy [formerly RackAwareStrategy]
places one replica in each of two datacenters, and the third on a
different rack in in the first. Additional datacenters are not
guaranteed to get a replica. Additional replicas after three are
placed in ring order after the third without regard to rack or
datacenter.
Supports a single strategy option 'replication_factor' that
specifies the replication factor for the cluster.
- strategy_options: Optional additional options for placement_strategy.
Options have the form {key:value}, see the information on each
strategy and the examples.
- durable_writes: When set to false all RowMutations on keyspace will by-pass CommitLog.
Set to true by default.
Examples:
create keyspace Keyspace2
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = {replication_factor:4};
create keyspace Keyspace3
with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
and strategy_options={DC1:2, DC2:2};
create keyspace Keyspace4
with placement_strategy = 'org.apache.cassandra.locator.OldNetworkTopologyStrategy'
and strategy_options = {replication_factor:1};
create column family Standard1
with comparator = BytesType
and keys_cached = 10000
and rows_cached = 1000
and row_cache_save_period = 0
and key_cache_save_period = 3600
and memtable_throughput = 255
and memtable_operations = 0.29;
Inside a keyspace you create a "column family". In MySQL terms, it is a table.
comparator: determines the type and sort order of each column's "name".
memtable_*: haven't these settings been removed?
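The effect of the comparator on column order can be sketched in Python. This is only an analogy: the column names below are invented, and the two sort keys stand in for a byte-wise comparison (BytesType-like) and a numeric comparison (LongType-like); real LongType names are stored as 8-byte values, not as text.

```python
# Hypothetical column names, written as text for readability.
column_names = ["10", "9", "100", "1"]

# Byte-wise comparison of the raw bytes of each name.
bytes_order = sorted(column_names, key=lambda name: name.encode("utf-8"))

# Numeric comparison, treating each name as an integer.
long_order = sorted(column_names, key=int)

print(bytes_order)
print(long_order)
```

The same set of names comes back in a different order, which is why the comparator has to match how the column names will be read back.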
[default@unknown] help create column family;
create column family <name>;
create column family <name> with <att1>=<value1>;
create column family <name> with <att1>=<value1> and <att2>=<value2>...;
Create a column family in the current keyspace with the specified
attributes.
Required Parameters:
- name: Name of the new column family. Names may only contain letters,
numbers and underscores.
column family Attributes (all are optional):
- column_metadata: Defines the validation and indexes for known columns in
this column family.
Columns not listed in the column_metadata section will use the
default_validator to validate their values.
Column Required parameters:
- name: Binds a validator (and optionally an indexer) to columns
with this name in any row of the enclosing column family.
- validator: Validator to use for values for this column.
Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- Int32Type
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type
It is also valid to specify the fully-qualified class name to a class
that extends org.apache.Cassandra.db.marshal.AbstractType.
Column Optional parameters:
- index_name: Name for the index. Both an index name and
type must be specified.
- index_type: The type of index to be created.
Suported values are:
- KEYS: a ColumnFamily backed index
- CUSTOM: a user supplied index implementaion. You must supply a
'class_name' field in the index_options with the full classname
of the implementation.
- index_options: Optional additional options for index_type.
Options have the form {key:value}.
- column_type: Type of columns this column family holds, valid values are
Standard and Super. Default is Standard.
- comment: Human readable column family description.
- comparator: Validator to use to validate and compare column names in
this column family. For Standard column families it applies to columns, for
Super column families applied to super columns. Also see the subcomparator
attribute. Default is BytesType, which is a straight forward lexical
comparison of the bytes in each column.
Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- Int32Type
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type
It is also valid to specify the fully-qualified class name to a class that
extends org.apache.Cassandra.db.marshal.AbstractType.
- default_validation_class: Validator to use for values in columns which are
not listed in the column_metadata. Default is BytesType which applies
no validation.
Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- Int32Type
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type
It is also valid to specify the fully-qualified class name to a class that
extends org.apache.Cassandra.db.marshal.AbstractType.
- key_validation_class: Validator to use for keys.
Default is BytesType which applies no validation.
Supported values are:
- AsciiType
- BytesType
- Int32Type
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type
It is also valid to specify the fully-qualified class name to a class that
extends org.apache.Cassandra.db.marshal.AbstractType.
- gc_grace: Time to wait in seconds before garbage collecting tombstone
deletion markers. Default value is 864000 or 10 days.
Set this to a large enough value that you are confident that the deletion
markers will be propagated to all replicas by the time this many seconds
has elapsed, even in the face of hardware failures.
See http://wiki.apache.org/Cassandra/DistributedDeletes
- keys_cached: Maximum number of keys to cache in memory. Valid values are
either a double between 0 and 1 (inclusive on both ends) denoting what
fraction should be cached. Or an absolute number of rows to cache.
Default value is 200000.
Each key cache hit saves 1 seek and each row cache hit saves 2 seeks at the
minimum, sometimes more. The key cache is fairly tiny for the amount of
time it saves, so it's worthwhile to use it at large numbers all the way
up to 1.0 (all keys cached). The row cache saves even more time, but must
store the whole values of its rows, so it is extremely space-intensive.
It's best to only use the row cache if you have hot rows or static rows.
- key_cache_save_period: Duration in seconds after which Cassandra should
safe the keys cache. Caches are saved to saved_caches_directory as
specified in conf/Cassandra.yaml. Default is 14400 or 4 hours.
Saved caches greatly improve cold-start speeds, and is relatively cheap in
terms of I/O for the key cache. Row cache saving is much more expensive and
has limited use.
- read_repair_chance: Probability (0.0-1.0) with which to perform read
repairs for any read operation. Default is 0.1.
Note that disabling read repair entirely means that the dynamic snitch
will not have any latency information from all the replicas to recognize
when one is performing worse than usual.
- rows_cached: Maximum number of rows whose entire contents we
cache in memory. Valid values are either a double between 0 and 1 (
inclusive on both ends) denoting what fraction should be cached. Or an
absolute number of rows to cache. Default value is 0, to disable row
caching.
Each key cache hit saves 1 seek and each row cache hit saves 2 seeks at the
minimum, sometimes more. The key cache is fairly tiny for the amount of
time it saves, so it's worthwhile to use it at large numbers all the way
up to 1.0 (all keys cached). The row cache saves even more time, but must
store the whole values of its rows, so it is extremely space-intensive.
It's best to only use the row cache if you have hot rows or static rows.
- row_cache_save_period: Duration in seconds after which Cassandra should
safe the row cache. Caches are saved to saved_caches_directory as specified
in conf/Cassandra.yaml. Default is 0 to disable saving the row cache.
Saved caches greatly improve cold-start speeds, and is relatively cheap in
terms of I/O for the key cache. Row cache saving is much more expensive and
has limited use.
- subcomparator: Validator to use to validate and compare sub column names
in this column family. Only applied to Super column families. Default is
BytesType, which is a straight forward lexical comparison of the bytes in
each column.
Supported values are:
- AsciiType
- BytesType
- CounterColumnType (distributed counter column)
- Int32Type
- IntegerType (a generic variable-length integer type)
- LexicalUUIDType
- LongType
- UTF8Type
It is also valid to specify the fully-qualified class name to a class that
extends org.apache.Cassandra.db.marshal.AbstractType.
- max_compaction_threshold: The maximum number of SSTables allowed before a
minor compaction is forced. Default is 32, setting to 0 disables minor
compactions.
Decreasing this will cause minor compactions to start more frequently and
be less intensive. The min_compaction_threshold and max_compaction_threshold
boundaries are the number of tables Cassandra attempts to merge together at
once.
- min_compaction_threshold: The minimum number of SSTables needed
to start a minor compaction. Default is 4, setting to 0 disables minor
compactions.
Increasing this will cause minor compactions to start less frequently and
be more intensive. The min_compaction_threshold and max_compaction_threshold
boundaries are the number of tables Cassandra attempts to merge together at
once.
- replicate_on_write: Replicate every counter update from the leader to the
follower replicas. Accepts the values true and false.
- row_cache_provider: The provider for the row cache to use for this
column family.
Supported values are:
- ConcurrentLinkedHashCacheProvider
- SerializingCacheProvider (requires JNA)
It is also valid to specify the fully-qualified class name to a class
that implements org.apache.cassandra.cache.IRowCacheProvider.
row_cache_provider defaults to SerializingCacheProvider if you have JNA
enabled, otherwise ConcurrentLinkedHashCacheProvider.
SerializingCacheProvider serialises the contents of the row and stores
it in native memory, i.e., off the JVM Heap. Serialized rows take
significantly less memory than "live" rows in the JVM, so you can cache
more rows in a given memory footprint. And storing the cache off-heap
means you can use smaller heap sizes, reducing the impact of GC pauses.
- compression_options: Options related to compression.
Options have the form {key:value}.
The main recognized options are:
- sstable_compression: the algorithm to use to compress sstables for
this column family. If none is provided, compression will not be
enabled. Supported values are SnappyCompressor, DeflateCompressor or
any custom compressor. It is also valid to specify the fully-qualified
class name to a class that implements org.apache.cassandra.io.ICompressor.
- chunk_length_kb: specify the size of the chunk used by sstable
compression (default to 64, must be a power of 2).
To disable compression just set compression_options to null like this
`compression_options = null`.
Examples:
create column family Super4
with column_type = 'Super'
and comparator = 'AsciiType'
and rows_cached = 10000;
create column family Standard3
with comparator = 'LongType'
and rows_cached = 10000;
create column family Standard4
with comparator = AsciiType
and column_metadata =
[{
column_name : Test,
validation_class : IntegerType,
index_type : 0,
index_name : IdxName},
{
column_name : 'other name',
validation_class : LongType
}];
create column family Standard2
with comparator = UTF8Type
and read_repair_chance = 0.1
and keys_cached = 100
and gc_grace = 0
and min_compaction_threshold = 5
and max_compaction_threshold = 31;
create column family StandardByUUID1
with comparator = TimeUUIDType;
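read_repair_chance: ? (according to the help above, the probability, from 0.0 to 1.0, of performing read repair on a read.)
gc_grace: ? (according to the help above, the time in seconds to wait before garbage collecting tombstones, i.e. deletion markers.)
compaction_*: tuning for compaction.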
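The second-based values in the help text are easier to read after converting units. The sketch below does that arithmetic in Python and adds a deliberately simplified check for when a tombstone would be old enough to collect; the helper name and its timestamp arguments are hypothetical, and the real collection happens during compaction with more conditions than this.

```python
from datetime import timedelta

# Defaults quoted in the help output above.
gc_grace_default = timedelta(seconds=864000)
key_cache_save_default = timedelta(seconds=14400)

gc_grace_days = gc_grace_default.days
key_cache_save_hours = key_cache_save_default / timedelta(hours=1)


def tombstone_is_collectable(deleted_at, now, gc_grace_seconds):
    """Simplified: True once gc_grace seconds have passed since the delete."""
    return now - deleted_at >= gc_grace_seconds


print(gc_grace_days, key_cache_save_hours)
# Standard2 sets gc_grace = 0, so a tombstone is collectable immediately.
print(tombstone_is_collectable(deleted_at=1000, now=1000, gc_grace_seconds=0))
```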
create column family Super1
with column_type = Super
and comparator = BytesType
and subcomparator = BytesType;
create column family Super2
with column_type = Super
and subcomparator = UTF8Type
and rows_cached = 10000
and keys_cached = 50
and comment = 'A column family with supercolumns, whose column and subcolumn names are UTF8 strings';
create column family Super3
with column_type = Super
and comparator = LongType
and comment = 'A column family with supercolumns, whose column names are Longs (8 bytes)';
When column_type = Super,
comparator is the setting for the "super column" names,
and subcomparator is the setting for the "column" names inside each super column.
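One way to picture this is as a two-level map: super column name, then column name, then value. The Python sketch below is only a mental model with invented data; the function name is hypothetical, and the two sort keys stand in for a LongType-like comparator and a BytesType-like subcomparator, roughly as in Super3.

```python
# One row of a Super column family: super column name -> {column name: value}.
row = {
    20: {"b": "x", "a": "y"},
    3: {"z": "1"},
    100: {"k": "v", "c": "w"},
}


def ordered_view(row, comparator, subcomparator):
    """List super columns in comparator order, each with its column names
    in subcomparator order."""
    return [
        (super_name, sorted(row[super_name], key=subcomparator))
        for super_name in sorted(row, key=comparator)
    ]


view = ordered_view(
    row,
    comparator=lambda name: name,                     # numeric order
    subcomparator=lambda name: name.encode("utf-8"),  # byte-wise order
)
print(view)
```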
create column family Indexed1
with comparator = UTF8Type
and default_validation_class = LongType
and column_metadata = [{
column_name : birthdate,
validation_class : LongType,
index_name : birthdate_idx,
index_type : 0}
];
default_validation_class: the default type (validator) for a column's "value".
column_metadata: per-column settings. You can create an index here!!
index_type: KEYS (a ColumnFamily backed index) or CUSTOM. KEYS supports simple equality queries.
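As a rough picture of what an equality lookup through an index like birthdate_idx buys you, here is a Python sketch that maps each indexed value to the row keys holding it. The rows, values, and function name are invented, and this is not how Cassandra stores a KEYS index internally; it only shows why an "equals" query becomes a single lookup.

```python
from collections import defaultdict

# Hypothetical rows: row key -> {column name: value}.
rows = {
    "alice": {"birthdate": 1975},
    "bob": {"birthdate": 1980},
    "carol": {"birthdate": 1975},
}


def build_keys_index(rows, column_name):
    """Map each value of column_name to the list of row keys that hold it."""
    index = defaultdict(list)
    for row_key, columns in rows.items():
        if column_name in columns:
            index[columns[column_name]].append(row_key)
    return index


birthdate_idx = build_keys_index(rows, "birthdate")
# Equality query: which rows have birthdate == 1975?
matches = sorted(birthdate_idx.get(1975, []))
print(matches)
```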
create column family Counter1
with default_validation_class = CounterColumnType;
create column family SuperCounter1
with column_type = Super
and default_validation_class = CounterColumnType;
CounterColumnType (distributed counter column)
I wonder what this is.
