Hazelcast: The Map Data Type, Chinese Edition (Part 1)

4.1.Map

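4.1.1.Overview

Hazelcast Map (IMap) extends the java.util.concurrent.ConcurrentMap interface, and therefore java.util.Map. Simply put, it is a distributed implementation of the Java Map.

Common IMap operations work the same way as on an ordinary map: reads and writes go through the familiar get and put methods.

How does a distributed map work?

Hazelcast partitions your map entries and spreads them almost evenly across all Hazelcast members. Each member holds approximately "(1/n * total-data) + backups", where n is the number of nodes in the cluster.

To make this concrete, let's create a Hazelcast instance (a node) and then fill a map named capitals with key-value pairs, using the code below: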

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class FillMapMember {
  public static void main( String[] args ) {
    // Start a Hazelcast member (node) in this JVM.
    HazelcastInstance hzInstance = Hazelcast.newHazelcastInstance();
    // Get (or create) the distributed map named "capitals".
    Map<String, String> capitalcities = hzInstance.getMap( "capitals" );
    capitalcities.put( "1", "Tokyo" );
    capitalcities.put( "2", "Paris" );
    capitalcities.put( "3", "Washington" );
    capitalcities.put( "4", "Ankara" );
    capitalcities.put( "5", "Brussels" );
    capitalcities.put( "6", "Amsterdam" );
    capitalcities.put( "7", "New Delhi" );
    capitalcities.put( "8", "London" );
    capitalcities.put( "9", "Berlin" );
    capitalcities.put( "10", "Oslo" );
    capitalcities.put( "11", "Moscow" );
    // ...
    // ... (remaining entries elided in the original)
    capitalcities.put( "120", "Stockholm" );
  }
}

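When you run this code, a cluster member is created, along with a map whose entries are distributed across that member's partitions.

The figure below illustrates the result. At this point we have a cluster with a single member.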

[Figure: a cluster with a single member holding the capitals map]

NOTE: Some of the partitions will not contain any data entries, since we only have 120 objects and the partition count is 271 by default. This count is configurable and can be changed using the system property hazelcast.partition.count. Please see Advanced Configuration Properties.
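To see why 120 entries cannot fill 271 partitions, here is a minimal Java sketch that assigns the keys "1" through "120" to partitions by hash. It is an illustration only: Hazelcast hashes the serialized key with its own hash function, whereas this sketch assumes String.hashCode() as a stand-in, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: String.hashCode() is an assumed stand-in for the hash
// that Hazelcast applies to the serialized key.
final class PartitionOccupancySketch {

    // Returns how many of the partitionCount partitions receive at least one
    // of the keys "1" .. String.valueOf(entryCount).
    static int occupiedPartitions(int entryCount, int partitionCount) {
        Set<Integer> occupied = new HashSet<>();
        for (int i = 1; i <= entryCount; i++) {
            String key = String.valueOf(i);
            // floorMod keeps the partition id non-negative for negative hashes.
            occupied.add(Math.floorMod(key.hashCode(), partitionCount));
        }
        return occupied.size();
    }

    public static void main(String[] args) {
        int occupied = occupiedPartitions(120, 271);
        System.out.println("occupied partitions: " + occupied);
        System.out.println("empty partitions:    " + (271 - occupied));
    }
}
```

Whatever hash is used, at most 120 partitions can hold an entry, so at least 151 of the 271 stay empty.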

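Next, let's create a second cluster member. Backups of the entries are created as well. Backups are explained in detail in the Hazelcast Overview section.

Run the same code again to create the second member. The figure below shows the two members and how the data and its backups are stored. As you can see, the backups are distributed too.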

[Figure: a cluster with two members, showing how the data and its backups are stored]

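As you can see, when a new member joins the cluster, it takes over ownership of some of the data in the cluster. Eventually it carries approximately "(1/n * total-data) + backups" of the data, which reduces the load on the other members.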
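The "(1/n * total-data) + backups" estimate can be worked through in Java. This is a rough sketch that assumes a perfectly even distribution and that a backup always lives on a different member than the original; the class and method names are hypothetical.

```java
// Rough arithmetic for the "(1/n * total-data) + backups" estimate.
// Assumes perfectly even distribution; names are hypothetical.
final class MemberShareSketch {

    // Entries owned by one member (the "1/n * total-data" term).
    static double ownedPerMember(long totalEntries, int memberCount) {
        return (double) totalEntries / memberCount;
    }

    // Backup entries held by one member. A backup is kept on a different
    // member than the original, so at most (memberCount - 1) copies exist.
    static double backupsPerMember(long totalEntries, int memberCount, int backupCount) {
        int effectiveBackups = Math.min(backupCount, memberCount - 1);
        return effectiveBackups * ownedPerMember(totalEntries, memberCount);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            double owned = ownedPerMember(120, n);
            double backups = backupsPerMember(120, n, 1);
            System.out.printf("members=%d owned=%.0f backups=%.0f%n", n, owned, backups);
        }
    }
}
```

With the 120 capitals and one backup, a single member owns all 120 entries and holds no backups; with two members, each owns about 60 entries and holds about 60 backup entries.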

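HazelcastInstance::getMap actually returns an instance of com.hazelcast.core.IMap, which extends the java.util.concurrent.ConcurrentMap interface.

Methods such as ConcurrentMap.putIfAbsent(key, value) and ConcurrentMap.replace(key, value) can therefore be used on the distributed map. Here is an example: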

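import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.concurrent.ConcurrentMap;

// The wrapper class name is illustrative. Customer is assumed to be a
// serializable class with a getId() method, defined elsewhere.
public class CustomerOperations {

    private final HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();

    public Customer getCustomer( String id ) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap( "customers" );
        Customer customer = customers.get( id );
        if (customer == null) {
            customer = new Customer( id );
            // putIfAbsent returns null when our value was stored, otherwise
            // the value that another thread or member stored first.
            Customer existing = customers.putIfAbsent( id, customer );
            if (existing != null) {
                customer = existing;
            }
        }
        return customer;
    }

    public boolean updateCustomer( Customer customer ) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap( "customers" );
        // replace returns the previous value, or null if the key was absent.
        return ( customers.replace( customer.getId(), customer ) != null );
    }

    public boolean removeCustomer( Customer customer ) {
        ConcurrentMap<String, Customer> customers = hazelcastInstance.getMap( "customers" );
        // Removes the entry only if it is currently mapped to this value.
        return customers.remove( customer.getId(), customer );
    }
}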
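The return values of putIfAbsent, replace, and remove(key, value) are easy to misread. The following sketch shows them on a plain java.util.concurrent.ConcurrentHashMap, used here as a local stand-in for the distributed map; the class name and the helper getOrCreate are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Local stand-in: a ConcurrentHashMap follows the same ConcurrentMap contract.
final class ConcurrentMapReturnValues {

    // Returns the value mapped to key after the call, inserting candidate if absent.
    static String getOrCreate(ConcurrentMap<String, String> map, String key, String candidate) {
        String previous = map.putIfAbsent(key, candidate);
        // putIfAbsent returns null when it inserted the candidate itself.
        return previous == null ? candidate : previous;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> customers = new ConcurrentHashMap<>();
        System.out.println(customers.putIfAbsent("1", "Alice")); // null: inserted
        System.out.println(customers.putIfAbsent("1", "Bob"));   // Alice: already present
        System.out.println(customers.replace("2", "Carol"));     // null: key absent, nothing stored
        System.out.println(customers.replace("1", "Bob"));       // Alice: previous value
        System.out.println(customers.remove("1", "Alice"));      // false: value is now Bob
        System.out.println(customers.remove("1", "Bob"));        // true: entry removed
    }
}
```

Note that a null result from putIfAbsent means the insert succeeded, so a caller that wants the stored value has to fall back to its own candidate in that case.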

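All ConcurrentMap operations, such as put and remove, may wait if the key is locked by another thread in the local JVM or in a remote JVM, but they eventually return with success. ConcurrentMap operations never throw java.util.ConcurrentModificationException.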

Also see:

Data Affinity.
Map Configuration with wildcards.

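4.1.2.Map Backups

Hazelcast distributes map entries across multiple JVMs (cluster members). Each JVM holds a portion of the data, and that data must not be lost when a JVM crashes.

For this reason, a distributed map keeps a backup of each member's data on another member, so that a member crash does not cause data loss. Backup operations are synchronous: when map.put(key, value) returns, the entry is guaranteed to have been replicated to one other node. For reads, map.get(key) is guaranteed to return the latest value of the entry. Keep in mind that consistency is strict for distributed map entries.

Sync Backup

To provide data safety, Hazelcast lets you specify the number of backup copies. The data held by one JVM is then copied to that many other JVMs. Use the backup-count element to configure the number of backups.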

<hazelcast>
  <map name="default">
    <backup-count>1</backup-count>
  </map>
</hazelcast>

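When the count is 1, each map entry has its backup on one other node in the cluster. When it is 2, each entry has its backup on two other nodes. The count can also be set to 0, in which case no backups are made, for example when performance matters more than keeping backups. The maximum backup count is 6.

Hazelcast supports both synchronous and asynchronous backups. By default, backups are synchronous (configured with backup-count). In this case, a backup operation blocks the calling operation until the backups have been copied to the backup nodes (or deleted from them, in the case of a remove) and acknowledgements have been received. As a result, by the time a put returns, you can be sure the backups are updated. This blocking has a cost, of course, and may lead to latency issues.

Async Backup

Asynchronous backups, on the other hand, do not block operations. They are fire-and-forget and do not require acknowledgements; the backup operations are performed at some later point in time. Use the async-backup-count element to configure asynchronous backups.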

<hazelcast>
  <map name="default">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
  </map>
</hazelcast>
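The difference between the two backup modes can be sketched in plain Java. This is a local simulation under stated assumptions, not Hazelcast's implementation: two in-memory maps stand in for the owner and the backup node, a single worker thread stands in for the network, and the future returned by the asynchronous variant is exposed only so that a demo can wait for it. All names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Local simulation of sync vs async backups; not Hazelcast's implementation.
final class BackupModeSketch {
    final Map<String, String> primary = new ConcurrentHashMap<>();
    final Map<String, String> backup = new ConcurrentHashMap<>();

    // Daemon worker thread standing in for the link to the backup node.
    private final ExecutorService backupWorker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "backup-worker");
        t.setDaemon(true);
        return t;
    });

    // Sync backup: returns only after the backup copy has been written.
    void putWithSyncBackup(String key, String value) {
        primary.put(key, value);
        CompletableFuture.runAsync(() -> backup.put(key, value), backupWorker).join();
    }

    // Async backup: returns at once; the backup is written some time later.
    CompletableFuture<Void> putWithAsyncBackup(String key, String value) {
        primary.put(key, value);
        return CompletableFuture.runAsync(() -> backup.put(key, value), backupWorker);
    }

    public static void main(String[] args) {
        BackupModeSketch cluster = new BackupModeSketch();
        cluster.putWithSyncBackup("1", "Tokyo");
        System.out.println("sync backup present on return: " + cluster.backup.containsKey("1"));
        CompletableFuture<Void> pending = cluster.putWithAsyncBackup("2", "Paris");
        pending.join(); // only the demo waits; a real caller would not
        System.out.println("async backup present after waiting: " + cluster.backup.containsKey("2"));
    }
}
```

The trade-off is visible in the signatures: the synchronous put pays for the round trip before returning, while the asynchronous put returns before the backup is guaranteed to exist.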

NOTE: Backups increase memory usage since they are also kept in memory. Each backup adds another full copy of the data, so a single backup doubles the original memory consumption.

NOTE: A map can have both sync and async backups at the same time.
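As a back-of-the-envelope check of the memory cost, the sketch below counts in-memory copies per entry when sync and async backups are combined. It assumes the cluster has enough members to host every configured copy; the class and method names are hypothetical.

```java
// Back-of-the-envelope memory estimate; assumes the cluster has enough
// members to host every configured backup copy. Names are hypothetical.
final class BackupMemorySketch {

    // In-memory copies of each entry: the original plus every backup.
    static int copiesPerEntry(int backupCount, int asyncBackupCount) {
        return 1 + backupCount + asyncBackupCount;
    }

    // Cluster-wide memory needed for dataMb megabytes of original data.
    static long clusterMemoryMb(long dataMb, int backupCount, int asyncBackupCount) {
        return dataMb * copiesPerEntry(backupCount, asyncBackupCount);
    }

    public static void main(String[] args) {
        System.out.println(clusterMemoryMb(1000, 0, 0)); // 1000: no backups
        System.out.println(clusterMemoryMb(1000, 1, 0)); // 2000: one sync backup
        System.out.println(clusterMemoryMb(1000, 1, 1)); // 3000: one sync and one async backup
    }
}
```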

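Read Backup Data

By default, Hazelcast keeps one sync backup copy. If the backup count is at least 1, each member holds both its own entries and backup copies of other members' entries. A map.get(key) call may therefore be made on a member that holds a backup copy of that key. By default, map.get(key) always reads the value from the member that actually owns the key, in order to keep reads consistent. If read-backup-data is set to true, the call may instead read the value directly from the backup copy held on the local member.

For the sake of consistency, read-backup-data is false by default. Setting it to true improves read performance.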

<hazelcast>
  <map name="default">
    <backup-count>0</backup-count>
    <async-backup-count>1</async-backup-count>
    <read-backup-data>true</read-backup-data>
  </map>
</hazelcast>

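This feature is available only when there is at least 1 sync or async backup.

4.1.3.Eviction

Unless you delete map entries manually or use an eviction policy, they remain in the map. Hazelcast supports policy-based eviction for distributed maps. The available policies are LRU (Least Recently Used) and LFU (Least Frequently Used).

The related configuration elements are shown below: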

<hazelcast>
  <map name="default">
    ...
    <time-to-live-seconds>0</time-to-live-seconds>
    <max-idle-seconds>0</max-idle-seconds>
    <eviction-policy>LRU</eviction-policy>
    <max-size policy="PER_NODE">5000</max-size>
    <eviction-percentage>25</eviction-percentage>
    ...
  </map>
</hazelcast>

Each element is described below.

time-to-live-seconds : Maximum time in seconds for each entry to stay in the map. If it is not 0, entries that are older than this time and have not been updated within it are evicted automatically. Valid values are integers between 0 and Integer.MAX_VALUE. Default value is 0, which means infinite. Moreover, if it is not 0, entries are evicted regardless of the configured eviction-policy.
max-idle-seconds : Maximum time in seconds for each entry to stay idle in the map. Entries that are idle for more than this time are evicted automatically. An entry is idle if no get, put or containsKey is called on it. Valid values are integers between 0 and Integer.MAX_VALUE. Default value is 0, which means infinite.
eviction-policy : Valid values are described below.
- NONE: Default policy. If set, no items will be evicted and the property max-size will be ignored. You can still combine it with time-to-live-seconds and max-idle-seconds.
- LRU: Least Recently Used.
- LFU: Least Frequently Used.
max-size : Maximum size of the map. When the maximum size is reached, the map is evicted based on the policy defined. Valid values are integers between 0 and Integer.MAX_VALUE. Default value is 0. For max-size to work, eviction-policy must be set to a value other than NONE. Its policy attribute values are described below.
- PER_NODE : Maximum number of map entries in each JVM. This is the default policy. <max-size policy="PER_NODE">5000</max-size>
- PER_PARTITION : Maximum number of map entries within each partition. Storage size depends on the partition count in a JVM, so this attribute may not be used often. A small cluster hosts more partitions per JVM, and therefore more map entries per JVM, than a larger cluster. <max-size policy="PER_PARTITION">27100</max-size>
- USED_HEAP_SIZE : Maximum used heap size in megabytes for each JVM. <max-size policy="USED_HEAP_SIZE">4096</max-size>
- USED_HEAP_PERCENTAGE : Maximum used heap size percentage for each JVM. If, for example, the JVM is configured to have 1000 MB and this value is 10, then map entries will be evicted when the used heap size exceeds 100 MB. <max-size policy="USED_HEAP_PERCENTAGE">10</max-size>
eviction-percentage : When max-size is reached, the specified percentage of the map is evicted. If 25 is set, for example, 25% of the entries will be evicted. A smaller value evicts only a small number of entries each time, so if entries are inserted frequently, small percentages cause eviction to run more often and may add overhead. Valid values are integers between 0 and 100. Default value is 25.
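How max-size, the LRU policy, and eviction-percentage interact can be sketched with a plain java.util.LinkedHashMap in access order. This is a single-JVM illustration under simplifying assumptions (eviction runs on the put that pushes the size past the limit, and recency is tracked exactly); it is not how Hazelcast implements eviction internally. The class name is hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Single-JVM illustration of LRU eviction with a max size and an eviction
// percentage; a simplification, not Hazelcast's internal algorithm.
final class LruEvictionSketch<K, V> {
    private final int maxSize;
    private final int evictionPercentage;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);

    LruEvictionSketch(int maxSize, int evictionPercentage) {
        this.maxSize = maxSize;
        this.evictionPercentage = evictionPercentage;
    }

    V get(K key) {
        return entries.get(key); // counts as a use and refreshes recency
    }

    void put(K key, V value) {
        entries.put(key, value);
        if (entries.size() > maxSize) {
            evictLeastRecentlyUsed();
        }
    }

    // Removes evictionPercentage percent of the entries (at least one),
    // starting with the least recently used.
    private void evictLeastRecentlyUsed() {
        int toEvict = Math.max(1, entries.size() * evictionPercentage / 100);
        Iterator<K> it = entries.keySet().iterator();
        for (int i = 0; i < toEvict && it.hasNext(); i++) {
            it.next();
            it.remove();
        }
    }

    int size() {
        return entries.size();
    }

    boolean containsKey(K key) {
        return entries.containsKey(key); // does not change recency here
    }
}
```

For example, with maxSize 4 and evictionPercentage 25, putting keys 1 to 4, reading key 1, and then putting key 5 evicts key 2, because key 1 was refreshed by the read.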

Sample Eviction Configuration

<map name="documents">
  <max-size policy="PER_NODE">10000</max-size>
  <eviction-policy>LRU</eviction-policy>
  <max-idle-seconds>60</max-idle-seconds>
</map>

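In this sample, the documents map starts to evict entries from a member when the map size on that member exceeds 10000. The least recently used entries are evicted first. In addition, entries that have not been used for more than 60 seconds are evicted.

Evicting Specific Entries

The eviction policies and settings described above apply to all entries of a map: any entry that meets the configured conditions is evicted.

But what if you want to evict specific entries? In that case, you can pass the ttl and timeunit parameters to map.put() when adding the entry. A sample line of code follows.

myMap.put( "1", "John", 50, TimeUnit.SECONDS );

Here, the entry with key "1" is evicted 50 seconds after it is put into myMap.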
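The per-entry time-to-live behaviour can be illustrated with a small local map that stores an expiry time next to each value. This is a sketch under assumptions: expiry is checked lazily on read, and the clock is injected so the behaviour can be checked without waiting. It is not Hazelcast's implementation, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Local illustration of per-entry TTL; expiry is checked lazily on get.
final class TtlMapSketch<K, V> {

    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Timed<V>> entries = new HashMap<>();
    private final LongSupplier clockMillis; // injected so tests can control time

    TtlMapSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    // Stores the value together with the time at which it expires.
    void put(K key, V value, long ttl, TimeUnit unit) {
        long expiresAt = clockMillis.getAsLong() + unit.toMillis(ttl);
        entries.put(key, new Timed<>(value, expiresAt));
    }

    // Returns the value, or null if the key is absent or its TTL has elapsed.
    V get(K key) {
        Timed<V> timed = entries.get(key);
        if (timed == null) {
            return null;
        }
        if (clockMillis.getAsLong() >= timed.expiresAtMillis) {
            entries.remove(key); // drop the expired entry
            return null;
        }
        return timed.value;
    }
}
```

In production code the clock would typically be System::currentTimeMillis; a controllable clock is used here only to make the expiry observable.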

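Evicting All Entries

The evictAll() method evicts all entries from the map except the locked ones. If a MapStore is defined for the map, evictAll() does not call deleteAll. If you want deleteAll to be called, use clear() instead.

Here is an example: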

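import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class EvictAll {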

    public static void main(String[] args) {
        final int numberOfKeysToLock = 4;
        final int numberOfEntriesToAdd = 1000;

        HazelcastInstance node1 = Hazelcast.newHazelcastInstance();
        HazelcastInstance node2 = Hazelcast.newHazelcastInstance();

        IMap<Integer, Integer> map = node1.getMap(EvictAll.class.getCanonicalName());
        for (int i = 0; i < numberOfEntriesToAdd; i++) {
            map.put(i, i);
        }

        for (int i = 0; i < numberOfKeysToLock; i++) {
            map.lock(i);
        }

        // should keep locked keys and evict all others.
        map.evictAll();

        System.out.printf("# After calling evictAll...\n");
        System.out.printf("# Expected map size\t: %d\n", numberOfKeysToLock);
        System.out.printf("# Actual map size\t: %d\n", map.size());

    }
}

The output is as follows:

# After calling evictAll...
# Expected map size	: 4
# Actual map size	: 4

NOTE: Only EVICT_ALL event is fired for any registered listeners.
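The distinction between evictAll() and clear() when a MapStore is defined can be modelled locally. In the sketch below, a second in-memory map stands in for the external store, puts are assumed to be written through to it, and deleteAll is modelled as removing the in-memory keys from that store. These are assumptions made for illustration, not Hazelcast's implementation, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Local model of evictAll() vs clear() with a backing store; assumptions are
// noted in comments and none of this is Hazelcast's actual code.
final class EvictVersusClearSketch {
    final Map<String, String> memory = new HashMap<>();
    final Map<String, String> store = new HashMap<>(); // stands in for the MapStore
    final Set<String> lockedKeys = new HashSet<>();

    // Assumed write-through: every put also reaches the store.
    void put(String key, String value) {
        memory.put(key, value);
        store.put(key, value);
    }

    void lock(String key) {
        lockedKeys.add(key);
    }

    // Like evictAll(): drops unlocked entries from memory, never touches the store.
    void evictAll() {
        memory.keySet().removeIf(key -> !lockedKeys.contains(key));
    }

    // Like clear(): empties memory and also deletes those keys from the store.
    void clear() {
        store.keySet().removeAll(new HashSet<>(memory.keySet()));
        memory.clear();
    }
}
```

After evictAll() the store still holds every entry, so evicted entries can be loaded again later; after clear() they are gone from the store as well.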

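Stay tuned for the sections that follow.

A note on this translation: it is intended for study and discussion only. If you find any mistakes, please point them out. Thank you!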

---------------------------------------------------------------------------------------------------------------------------------------------

English documentation: http://docs.hazelcast.org/docs/3.3/manual/html-single/hazelcast-documentation.html
