MPI Maelstrom (POJ-1502)

BIT has recently taken delivery of their new supercomputer, a 32 processor Apollo Odyssey distributed shared memory machine with a hierarchical communication subsystem. Valentine McKee's research advisor, Jack Swigert, has asked her to benchmark the new system. 

``Since the Apollo is a distributed shared memory machine, memory access and communication times are not uniform,'' Valentine told Swigert. ``Communication is fast between processors that share the same memory subsystem, but it is slower between processors that are not on the same subsystem. Communication between the Apollo and machines in our lab is slower yet.'' 

``How is Apollo's port of the Message Passing Interface (MPI) working out?'' Swigert asked. 

``Not so well,'' Valentine replied. ``To do a broadcast of a message from one processor to all the other n-1 processors, they just do a sequence of n-1 sends. That really serializes things and kills the performance.'' 

``Is there anything you can do to fix that?'' 

``Yes,'' smiled Valentine. ``There is. Once the first processor has sent the message to another, those two can then send messages to two other hosts at the same time. Then there will be four hosts that can send, and so on.'' 

``Ah, so you can do the broadcast as a binary tree!'' 

``Not really a binary tree -- there are some particular features of our network that we should exploit. The interface cards we have allow each processor to simultaneously send messages to any number of the other processors connected to it. However, the messages don't necessarily arrive at the destinations at the same time -- there is a communication cost involved. In general, we need to take into account the communication costs for each link in our network topologies and plan accordingly to minimize the total time required to do a broadcast.''

Input

The input will describe the topology of a network connecting n processors. The first line of the input will be n, the number of processors, such that 1 <= n <= 100. 

The rest of the input defines an adjacency matrix, A. The adjacency matrix is square and of size n x n. Each of its entries will be either an integer or the character x. The value of A(i,j) indicates the expense of sending a message directly from node i to node j. A value of x for A(i,j) indicates that a message cannot be sent directly from node i to node j. 

Note that for a node to send a message to itself does not require network communication, so A(i,i) = 0 for 1 <= i <= n. Also, you may assume that the network is undirected (messages can go in either direction with equal overhead), so that A(i,j) = A(j,i). Thus only the entries on the (strictly) lower triangular portion of A will be supplied. 

The input to your program will be the lower triangular section of A. That is, the second line of input will contain one entry, A(2,1). The next line will contain two entries, A(3,1) and A(3,2), and so on.
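The lower-triangular layout described above can be read into a full symmetric matrix along these lines. This is only a sketch: the function name `readLowerTriangle` and the sentinel `kNoLink` are illustrative choices, not part of the problem statement, and the matrix is stored 0-indexed.

```cpp
#include <cassert>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Sentinel for "no direct link" (an assumption of this sketch; chosen so that
// the sum of two sentinels still fits in a 32-bit int).
const int kNoLink = 0x3f3f3f3f;

// Reads n and the strictly lower triangular entries from `in` and returns the
// full symmetric n x n matrix. Storage is 0-indexed, so A(i,j) is a[i-1][j-1].
std::vector<std::vector<int> > readLowerTriangle(std::istream& in) {
    int n = 0;
    in >> n;
    assert(n >= 1 && n <= 100);                  // bounds from the statement
    std::vector<std::vector<int> > a(n, std::vector<int>(n, kNoLink));
    for (int i = 0; i < n; ++i) a[i][i] = 0;     // A(i,i) = 0
    for (int i = 1; i < n; ++i) {
        for (int j = 0; j < i; ++j) {
            std::string tok;
            in >> tok;                           // an integer or the character x
            if (tok != "x") {
                a[i][j] = a[j][i] = std::atoi(tok.c_str());  // A(i,j) = A(j,i)
            }
        }
    }
    return a;
}
```

Passing `std::cin` reads from standard input; a `std::istringstream` is convenient for trying the function on a fixed string.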

Output

Your program should output the minimum communication time required to broadcast a message from the first processor to all the other processors.

Sample Input

5
50
30 5
100 20 50
10 x x 10
           

Sample Output

35
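To see where 35 comes from: the cheapest routes from processor 1 are 1→5 (10), 1→5→4 (20), 1→3 (30) and 1→3→2 (35), and the broadcast ends when the slowest of these arrives. The sketch below checks this with Floyd-Warshall, which is affordable because n <= 100. Floyd-Warshall is swapped in here only for brevity; the names `broadcastTimeFloyd`, `sampleMatrix` and `kInf` are hypothetical helpers for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

const int kInf = 0x3f3f3f3f;  // "no link" sentinel, an assumption of this sketch

// Runs Floyd-Warshall on a symmetric cost matrix (0-indexed) and returns the
// largest shortest-path distance from node 0, or -1 if a node is unreachable.
int broadcastTimeFloyd(std::vector<std::vector<int> > d) {
    const int n = static_cast<int>(d.size());
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
    int worst = 0;
    for (int j = 0; j < n; ++j) {
        if (d[0][j] >= kInf) return -1;   // never reached by the broadcast
        worst = std::max(worst, d[0][j]);
    }
    return worst;
}

// The sample network from the statement, written out as a full matrix.
std::vector<std::vector<int> > sampleMatrix() {
    const int x = kInf;
    const int rows[5][5] = {
        {  0, 50, 30, 100, 10},
        { 50,  0,  5,  20,  x},
        { 30,  5,  0,  50,  x},
        {100, 20, 50,   0, 10},
        { 10,  x,  x,  10,  0}
    };
    std::vector<std::vector<int> > a(5, std::vector<int>(5));
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 5; ++j) a[i][j] = rows[i][j];
    return a;
}
```

Calling `broadcastTimeFloyd(sampleMatrix())` yields 35, matching the sample output.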
           

Problem summary:

A message starts at processor 1 and must reach all n processors. A processor that already holds the message can send it to any number of its neighbors at the same time, and each link has its own cost, so a processor receives the message at the cost of its cheapest path from processor 1. The broadcast is finished when the last processor has received the message.

Input: the first line is n (1 <= n <= 100). The following n-1 lines give the strictly lower triangular part of the adjacency matrix A: input line i (for i >= 2) holds A(i,1) through A(i,i-1). Each entry is either an integer cost or the character x, meaning there is no direct link between the two nodes. The matrix is symmetric, A(i,j) = A(j,i), and A(i,i) = 0.

Output: the minimum communication time needed to broadcast a message from the first processor to all the other processors.

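Approach and problem summary:

Every processor forwards the message as soon as it has it, so processor i receives the message at a time equal to the shortest-path distance from processor 1 to i. The answer is the largest of these distances. The code below reads the lower triangle into a symmetric matrix, runs an O(n^2) Dijkstra from node 1, and prints the maximum distance.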

AC code:

#include <stdio.h>
#include <algorithm>
#include <iostream>
#include <math.h>
#include <string.h>
#include <queue>
#include <stack>
#include <map>
#include <vector>
#include <stdlib.h>
#define inf 0x3f3f3f3f
#define MAX 105  // nodes are 1-indexed and n can be 100, so index 100 must be valid
using namespace std;
int dp[MAX][MAX];
int dis[MAX],vis[MAX];
int n;
void dijstra()
{
    int i,j,minn,k;
    for(i=1; i<=n; i++)
    {
        dis[i]=inf;
        vis[i]=0;
    }
    dis[1]=0;
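    for(i=1; i<=n; i++)
    {
        minn=inf;
        k=-1;
        // pick the closest node that has not been finalized yet
        for(j=1; j<=n; j++)
        {
            if(vis[j]==0 && dis[j]<minn)
            {
                minn=dis[j];
                k=j;
            }
        }
        if(k==-1)   // every remaining node is unreachable
            break;
        vis[k]=1;
        // relax every edge leaving k
        for(j=1; j<=n; j++)
            if(vis[j]==0 && dis[k]+dp[k][j]<dis[j])
                dis[j]=dis[k]+dp[k][j];
    }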
}
int main()
{
    char s[10];
    int i,j;
    int ans;
    while(~scanf("%d",&n))
    {
        ans=0;   // reset for each test case; also gives 0 when n==1
        for(i=1; i<=n; i++)
            for(j=1; j<=n; j++)
                if(i!=j)
                    dp[i][j]=inf;
                else
                    dp[i][j]=0;
        for(i=2; i<=n; i++)
            for(j=1; j<i; j++)
            {
                scanf("%s",s);
                if(s[0]!='x')
                    dp[i][j]=dp[j][i]=atoi(s);
            }
        dijstra();
        for(i=2; i<=n; i++)
            if(dis[i]>ans)
                ans=dis[i];
        printf("%d\n",ans);
    }
    return 0;
}
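
For comparison, the same idea can be written with a binary heap over an adjacency list. With n <= 100 the O(n^2) array version above is already fast enough, so this is only a sketch of the alternative; the names `Link`, `HeapItem` and `broadcastTimeHeap` are hypothetical and nodes are 0-indexed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

typedef std::pair<int, int> Link;      // (neighbor, cost) in the adjacency list
typedef std::pair<int, int> HeapItem;  // (tentative distance, node) in the heap

// Heap-based Dijkstra from node 0. Returns the largest shortest-path distance,
// or -1 if some node cannot be reached.
int broadcastTimeHeap(const std::vector<std::vector<Link> >& adj) {
    const int INF = 0x3f3f3f3f;
    assert(!adj.empty());              // at least one processor
    std::vector<int> dist(adj.size(), INF);
    std::priority_queue<HeapItem, std::vector<HeapItem>,
                        std::greater<HeapItem> > heap;  // min-heap on distance
    dist[0] = 0;
    heap.push(HeapItem(0, 0));
    while (!heap.empty()) {
        const HeapItem top = heap.top();
        heap.pop();
        const int d = top.first;
        const int u = top.second;
        if (d > dist[u]) continue;     // stale entry, u was already improved
        for (std::size_t e = 0; e < adj[u].size(); ++e) {
            const int v = adj[u][e].first;
            const int w = adj[u][e].second;
            if (d + w < dist[v]) {     // relax edge u -> v
                dist[v] = d + w;
                heap.push(HeapItem(dist[v], v));
            }
        }
    }
    const int worst = *std::max_element(dist.begin(), dist.end());
    return worst >= INF ? -1 : worst;
}
```

Each undirected link has to be added in both directions, for example `adj[u].push_back(Link(v, w)); adj[v].push_back(Link(u, w));`.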