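Java 8's groupingBy collector groups the elements of a collection, much like GROUP BY in MySQL. Note that the result is a Map.
Grouping a collection by a single property, counting per group, and sorting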
List<String> items =
Arrays.asList("apple", "apple", "banana",
"apple", "orange", "banana", "papaya");
// Group
Map<String, List<String>> result1 = items.stream().collect(
Collectors.groupingBy(
Function.identity()
)
);
//{papaya=[papaya], orange=[orange], banana=[banana, banana], apple=[apple, apple, apple]}
System.out.println(result1);
// Group and count
Map<String, Long> result2 = items.stream().collect(
Collectors.groupingBy(
Function.identity(), Collectors.counting()
)
);
// {papaya=1, orange=1, banana=2, apple=3}
System.out.println(result2);
Map<String, Long> finalMap = new LinkedHashMap<>();
// Group, count, and sort by count (descending)
result2.entrySet().stream()
.sorted(Map.Entry.<String, Long>comparingByValue().reversed())
.forEachOrdered(e -> finalMap.put(e.getKey(), e.getValue()));
// {apple=3, banana=2, papaya=1, orange=1}
System.out.println(finalMap);
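The counting and sorting steps can also be written as one method that returns the ordered map instead of filling it through forEachOrdered. The following is only a sketch: the class name GroupCountSort and the method countAndSort are made up for illustration, and the tie-break by key is an addition that makes the order of equal counts deterministic.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupCountSort {

    // Counts occurrences of each item, then orders the entries by count
    // (descending) and, for equal counts, by key (ascending).
    public static Map<String, Long> countAndSort(List<String> items) {
        Map<String, Long> counts = items.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (a, b) -> a,           // merge function; not called because keys are unique
                        LinkedHashMap::new));  // LinkedHashMap keeps the sorted insertion order
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("apple", "apple", "banana",
                "apple", "orange", "banana", "papaya");
        // prints {apple=3, banana=2, orange=1, papaya=1}
        System.out.println(countAndSort(items));
    }
}
```

Without the tie-break, entries with the same count keep whatever order the intermediate HashMap happens to produce, which is why the output above it lists papaya before orange.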

Grouping a collection by multiple properties
1. Concatenate several properties into one composite key
public static void main(String[] args) {
User user1 = new User("zhangsan", "beijing", 10);
User user2 = new User("zhangsan", "beijing", 20);
User user3 = new User("lisi", "shanghai", 30);
List<User> list = new ArrayList<User>();
list.add(user1);
list.add(user2);
list.add(user3);
Map<String, List<User>> collect = list.stream().collect(Collectors.groupingBy(e -> fetchGroupKey(e)));
//{zhangsan#beijing=[User{age=10, name='zhangsan', address='beijing'}, User{age=20, name='zhangsan', address='beijing'}],
// lisi#shanghai=[User{age=30, name='lisi', address='shanghai'}]}
System.out.println(collect);
}
private static String fetchGroupKey(User user){
return user.getName() +"#"+ user.getAddress();
}
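One caveat with a concatenated key: if a property value can itself contain the delimiter, two different combinations may produce the same string. The sketch below illustrates this with a hypothetical minimal User class (the original class is not shown in this post) and compares the string key with a list key, which needs no delimiter. The names CompositeKeyDemo, concatKey, and listKey are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CompositeKeyDemo {

    // Hypothetical minimal User; an assumption, not the class used in the post.
    public static class User {
        private final String name;
        private final String address;
        private final int age;

        public User(String name, String address, int age) {
            this.name = name;
            this.address = address;
            this.age = age;
        }

        public String getName() { return name; }
        public String getAddress() { return address; }
        public int getAge() { return age; }
    }

    // Same key format as fetchGroupKey: name + "#" + address.
    public static String concatKey(User u) {
        return u.getName() + "#" + u.getAddress();
    }

    // Alternative key: lists are compared element by element, so no delimiter is involved.
    public static List<String> listKey(User u) {
        return Arrays.asList(u.getName(), u.getAddress());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("a#b", "c", 10),
                new User("a", "b#c", 20));

        Map<String, List<User>> byString =
                users.stream().collect(Collectors.groupingBy(CompositeKeyDemo::concatKey));
        Map<List<String>, List<User>> byList =
                users.stream().collect(Collectors.groupingBy(CompositeKeyDemo::listKey));

        System.out.println(byString.size()); // prints 1: both keys become "a#b#c"
        System.out.println(byList.size());   // prints 2: the two users stay in separate groups
    }
}
```

If the values are known never to contain the delimiter, the string key is fine and prints more readably.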
2. Nest calls to groupingBy
User user1 = new User("zhangsan", "beijing", 10);
User user2 = new User("zhangsan", "beijing", 20);
User user3 = new User("lisi", "shanghai", 30);
List<User> list = new ArrayList<User>();
list.add(user1);
list.add(user2);
list.add(user3);
Map<String, Map<String, List<User>>> collect
= list.stream().collect(
Collectors.groupingBy(
User::getAddress, Collectors.groupingBy(User::getName)
)
);
System.out.println(collect);
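The nested result is a map of maps, so reading a value takes one lookup per level. As a rough sketch, the example below nests groupingBy with a counting downstream collector and reads a count without risking a NullPointerException when the outer key is missing. The User class and the names NestedGroupingDemo, countByAddressAndName, and count are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedGroupingDemo {

    // Hypothetical minimal User; an assumption, not the class used in the post.
    public static class User {
        private final String name;
        private final String address;
        private final int age;

        public User(String name, String address, int age) {
            this.name = name;
            this.address = address;
            this.age = age;
        }

        public String getName() { return name; }
        public String getAddress() { return address; }
        public int getAge() { return age; }
    }

    // Groups by address, then by name, and counts the users in each inner group.
    public static Map<String, Map<String, Long>> countByAddressAndName(List<User> users) {
        return users.stream().collect(
                Collectors.groupingBy(User::getAddress,
                        Collectors.groupingBy(User::getName, Collectors.counting())));
    }

    // Returns 0 when either the address or the name has no group.
    public static long count(Map<String, Map<String, Long>> grouped, String address, String name) {
        return grouped.getOrDefault(address, Collections.<String, Long>emptyMap())
                .getOrDefault(name, 0L);
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("zhangsan", "beijing", 10),
                new User("zhangsan", "beijing", 20),
                new User("lisi", "shanghai", 30));

        Map<String, Map<String, Long>> grouped = countByAddressAndName(users);
        System.out.println(count(grouped, "beijing", "zhangsan")); // prints 2
        System.out.println(count(grouped, "shanghai", "lisi"));    // prints 1
        System.out.println(count(grouped, "tokyo", "lisi"));       // prints 0
    }
}
```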
3. Use Arrays.asList as the key
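I have a list of domain objects that represent web access records. The list can grow to thousands of objects.
I have neither the resources nor the need to store them in a database in their raw form, so I want to precompute aggregates and store the aggregated data in the database instead.
I need to aggregate the total bytes transferred in 5-minute windows, as in the following SQL query: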
select
round(request_timestamp, '5') as window, --round timestamp to the nearest 5 minute
cdn,
isp,
http_result_code,
transaction_time,
sum(bytes_transferred)
from web_records
group by
round(request_timestamp, '5'),
cdn,
isp,
http_result_code,
transaction_time
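On the Java side, the window column corresponds to a value derived from each record's timestamp. How the question's getFiveMinuteWindow does this is not shown, so the following is only one possible sketch. It assumes the timestamp is a java.time.Instant and truncates to the start of the 5-minute window rather than rounding to the nearest one; the class and method names are invented.

```java
import java.time.Instant;

public class FiveMinuteWindow {

    private static final long WINDOW_SECONDS = 5 * 60;

    // Returns the start of the 5-minute window that contains the timestamp.
    // Math.floorDiv keeps the result correct for timestamps before the epoch too.
    public static Instant windowStart(Instant timestamp) {
        long epochSeconds = timestamp.getEpochSecond();
        long floored = Math.floorDiv(epochSeconds, WINDOW_SECONDS) * WINDOW_SECONDS;
        return Instant.ofEpochSecond(floored);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2015-03-10T12:07:42Z");
        System.out.println(windowStart(t)); // prints 2015-03-10T12:05:00Z
    }
}
```

Every record between 12:05:00 and 12:09:59 then yields the same window value, which is what makes it usable as a grouping key.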
In Java 8, my first attempt looks like this. I know this solution is similar to the one in "Group by multiple field names in java 8":
Map<Date, Map<String, Map<String, Map<String, Map<String, Integer>>>>> aggregatedData =
webRecords
.stream()
.collect(Collectors.groupingBy(WebRecord::getFiveMinuteWindow,
Collectors.groupingBy(WebRecord::getCdn,
Collectors.groupingBy(WebRecord::getIsp,
Collectors.groupingBy(WebRecord::getResultCode,
Collectors.groupingBy(WebRecord::getTxnTime,
Collectors.reducing(0,
WebRecord::getReqBytes,
Integer::sum)))))));
This works, but it is ugly, and all those nested maps are a nightmare! To "flatten" or "unroll" the map into rows, I have to do this:
for (Date window : aggregatedData.keySet()) {
for (String cdn : aggregatedData.get(window).keySet()) {
for (String isp : aggregatedData.get(window).get(cdn).keySet()) {
for (String resultCode : aggregatedData.get(window).get(cdn).get(isp).keySet()) {
for (String txnTime : aggregatedData.get(window).get(cdn).get(isp).get(resultCode).keySet()) {
Integer bytesTransferred = aggregatedData.get(window).get(cdn).get(isp).get(resultCode).get(txnTime);
AggregatedRow row = new AggregatedRow(window, cdn, distId...
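As you can see, this is messy and hard to maintain.
Does anyone know a better way? Any help would be greatly appreciated.
I would like to know whether there is a better way to unroll the nested maps, or whether there is a library that lets you group a collection by several fields.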
Best answer
You should create a custom key for the map. The simplest way is to use Arrays.asList:
Function<WebRecord, List<Object>> keyExtractor = wr ->
Arrays.<Object>asList(wr.getFiveMinuteWindow(), wr.getCdn(), wr.getIsp(),
wr.getResultCode(), wr.getTxnTime());
Map<List<Object>, Integer> aggregatedData = webRecords.stream().collect(
Collectors.groupingBy(keyExtractor, Collectors.summingInt(WebRecord::getReqBytes)));
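In this case the key is a list of 5 elements in a fixed order. It is not very object-oriented, but it is simple. Alternatively, you can define your own type to represent the custom key and give it appropriate hashCode/equals implementations.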
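The second option, a dedicated key type, might look roughly like the sketch below. To keep it short it uses a reduced, hypothetical WebRecord with only the cdn, isp, and reqBytes fields; the real class in the question has more. GroupKey, CustomKeyDemo, and totalBytes are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class CustomKeyDemo {

    // Hypothetical, reduced WebRecord: only three of the fields from the question.
    public static class WebRecord {
        private final String cdn;
        private final String isp;
        private final int reqBytes;

        public WebRecord(String cdn, String isp, int reqBytes) {
            this.cdn = cdn;
            this.isp = isp;
            this.reqBytes = reqBytes;
        }

        public String getCdn() { return cdn; }
        public String getIsp() { return isp; }
        public int getReqBytes() { return reqBytes; }
    }

    // Immutable key type; equals and hashCode use the same two fields,
    // so equal keys always land in the same group.
    public static final class GroupKey {
        private final String cdn;
        private final String isp;

        public GroupKey(String cdn, String isp) {
            this.cdn = cdn;
            this.isp = isp;
        }

        public static GroupKey of(WebRecord wr) {
            return new GroupKey(wr.getCdn(), wr.getIsp());
        }

        public String getCdn() { return cdn; }
        public String getIsp() { return isp; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof GroupKey)) return false;
            GroupKey other = (GroupKey) o;
            return Objects.equals(cdn, other.cdn) && Objects.equals(isp, other.isp);
        }

        @Override
        public int hashCode() {
            return Objects.hash(cdn, isp);
        }
    }

    // Sums the transferred bytes per (cdn, isp) combination in a single flat map.
    public static Map<GroupKey, Integer> totalBytes(List<WebRecord> records) {
        return records.stream().collect(
                Collectors.groupingBy(GroupKey::of, Collectors.summingInt(WebRecord::getReqBytes)));
    }

    public static void main(String[] args) {
        List<WebRecord> records = Arrays.asList(
                new WebRecord("cdnA", "ispX", 100),
                new WebRecord("cdnA", "ispX", 250),
                new WebRecord("cdnB", "ispX", 40));

        Map<GroupKey, Integer> totals = totalBytes(records);
        System.out.println(totals.get(new GroupKey("cdnA", "ispX"))); // prints 350
        System.out.println(totals.get(new GroupKey("cdnB", "ispX"))); // prints 40
    }
}
```

Because the result is a flat map, turning it into rows takes a single loop over entrySet(), with the named getters on the key replacing positional access into a list.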