About the local SSD cache on Amazon Redshift RA3 instances
Compute nodes use large, high performance SSDs as local caches. Redshift leverages workload patterns and techniques such as automatic fine-grained data eviction and intelligent data prefetching, to deliver the performance of local SSD while scaling storage automatically to Amazon S3.
Figure 5 shows the key components of RMS extending from in-memory caches to committed data on Amazon S3. Snapshots of data on Amazon S3 act as logical restore points for the customer. Redshift supports both the restore of a complete cluster, as well as of specific tables, from any available restore point. Amazon S3 is also the data conduit and source of truth for data sharing and machine learning. RMS accelerates data accesses from S3 by using a prefetching scheme that pulls data blocks into memory and caches them to local SSDs. RMS tunes cache replacement to keep the relevant blocks locally available by tracking accesses to every block. This information is also used to help customers decide if scaling up their cluster would be beneficial. RMS makes in-place cluster resizing a pure metadata operation since compute nodes are practically stateless and always have access to the data blocks in RMS. RMS is metadata bound and easy to scale since data can be ingested directly into Amazon S3. The tiered nature of RMS where SSDs act as cache makes swapping out of hardware convenient.
Amazon Redshift re-invented - Amazon Science
RMS-supported Redshift RA3 instances provide up to 16 PB of capacity today. The in-memory disk-cache size can be dynamically changed for balancing performance and memory needs of queries.
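To make the tiered read path in the excerpts more concrete (in-memory cache, then local SSD, then Amazon S3, with per-block access tracking and prefetching), here is a minimal Python sketch. It is only an illustration and not Redshift's actual algorithm: the class name `TieredBlockCache`, the least-accessed eviction rule, the "fetch the next block" prefetch heuristic, and the use of a plain dict to stand in for S3 are all assumptions made for this example.

```python
from collections import Counter, OrderedDict


class TieredBlockCache:
    """Toy read path: memory (LRU) -> local SSD (least-accessed eviction) -> S3.

    Hypothetical illustration only; the policies here are assumptions,
    not the ones Redshift Managed Storage actually uses.
    """

    def __init__(self, s3_blocks, mem_capacity, ssd_capacity, prefetch=1):
        self.s3 = s3_blocks          # dict block_id -> bytes, stands in for Amazon S3
        self.mem = OrderedDict()     # in-memory tier, kept in LRU order
        self.ssd = {}                # local SSD tier
        self.hits = Counter()        # access count tracked for every block
        self.mem_capacity = mem_capacity
        self.ssd_capacity = ssd_capacity
        self.prefetch = prefetch     # how many following blocks to prefetch on a miss
        self.s3_reads = 0            # number of (simulated) S3 fetches

    def _admit_ssd(self, block_id, data):
        if block_id in self.ssd:
            return
        if len(self.ssd) >= self.ssd_capacity:
            # Evict the SSD-resident block with the fewest recorded accesses.
            victim = min(self.ssd, key=lambda b: self.hits[b])
            del self.ssd[victim]
        self.ssd[block_id] = data

    def _admit_mem(self, block_id, data):
        self.mem[block_id] = data
        self.mem.move_to_end(block_id)
        while len(self.mem) > self.mem_capacity:
            self.mem.popitem(last=False)  # drop the least recently used block

    def _fetch_from_s3(self, block_id):
        self.s3_reads += 1
        data = self.s3[block_id]
        self._admit_ssd(block_id, data)
        return data

    def read(self, block_id):
        self.hits[block_id] += 1
        if block_id in self.mem:              # memory hit
            self.mem.move_to_end(block_id)
            return self.mem[block_id]
        if block_id in self.ssd:              # SSD hit
            data = self.ssd[block_id]
        else:                                 # miss: go to S3, then prefetch neighbours
            data = self._fetch_from_s3(block_id)
            for nxt in range(block_id + 1, block_id + 1 + self.prefetch):
                if nxt in self.s3 and nxt not in self.ssd:
                    self._fetch_from_s3(nxt)
        self._admit_mem(block_id, data)
        return data
```

In this sketch a compute node holds no data of its own: every block can be re-fetched from the S3 stand-in, which mirrors why the excerpt describes compute nodes as practically stateless. The `hits` counter is the kind of per-block access information that could also inform whether a larger cache would help.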