MPI Maelstrom (POJ-1502)

BIT has recently taken delivery of their new supercomputer, a 32 processor Apollo Odyssey distributed shared memory machine with a hierarchical communication subsystem. Valentine McKee's research advisor, Jack Swigert, has asked her to benchmark the new system. 

"Since the Apollo is a distributed shared memory machine, memory access and communication times are not uniform," Valentine told Swigert. "Communication is fast between processors that share the same memory subsystem, but it is slower between processors that are not on the same subsystem. Communication between the Apollo and machines in our lab is slower yet."

"How is Apollo's port of the Message Passing Interface (MPI) working out?" Swigert asked.

"Not so well," Valentine replied. "To do a broadcast of a message from one processor to all the other n-1 processors, they just do a sequence of n-1 sends. That really serializes things and kills the performance."

"Is there anything you can do to fix that?"

"Yes," smiled Valentine. "There is. Once the first processor has sent the message to another, those two can then send messages to two other hosts at the same time. Then there will be four hosts that can send, and so on."

"Ah, so you can do the broadcast as a binary tree!"

"Not really a binary tree -- there are some particular features of our network that we should exploit. The interface cards we have allow each processor to simultaneously send messages to any number of the other processors connected to it. However, the messages don't necessarily arrive at the destinations at the same time -- there is a communication cost involved. In general, we need to take into account the communication costs for each link in our network topologies and plan accordingly to minimize the total time required to do a broadcast."

Input

The input will describe the topology of a network connecting n processors. The first line of the input will be n, the number of processors, such that 1 <= n <= 100. 

The rest of the input defines an adjacency matrix, A. The adjacency matrix is square and of size n x n. Each of its entries will be either an integer or the character x. The value of A(i,j) indicates the expense of sending a message directly from node i to node j. A value of x for A(i,j) indicates that a message cannot be sent directly from node i to node j. 

Note that for a node to send a message to itself does not require network communication, so A(i,i) = 0 for 1 <= i <= n. Also, you may assume that the network is undirected (messages can go in either direction with equal overhead), so that A(i,j) = A(j,i). Thus only the entries on the (strictly) lower triangular portion of A will be supplied. 

The input to your program will be the lower triangular section of A. That is, the second line of input will contain one entry, A(2,1). The next line will contain two entries, A(3,1) and A(3,2), and so on.
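The input layout above can be read into a full symmetric matrix with a small helper. The following is a minimal sketch rather than the solution's own reader: the function name `readLowerTriangle`, the constant `INF` (used to stand for an `x` entry), and the 0-indexed storage are all assumptions made for illustration.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed sentinel for "no direct link" (an x entry in the input).
const int INF = 0x3f3f3f3f;

// Hypothetical helper: reads n, then the strictly lower triangle of A,
// and returns the full symmetric n x n matrix with 0-indexed nodes.
std::vector<std::vector<int> > readLowerTriangle(std::istream& in) {
    int n = 0;
    in >> n;
    std::vector<std::vector<int> > a(n, std::vector<int>(n, INF));
    for (int i = 0; i < n; ++i)
        a[i][i] = 0;                      // A(i,i) = 0
    std::string token;
    for (int i = 1; i < n; ++i) {
        for (int j = 0; j < i; ++j) {
            in >> token;                  // either an integer or "x"
            if (token != "x")
                a[i][j] = a[j][i] = std::atoi(token.c_str());  // A(i,j) = A(j,i)
        }
    }
    return a;
}
```

Reading each entry as a string first is what makes the `x` case easy to handle; reading directly into an `int` would fail on the first `x`.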

Output

Your program should output the minimum communication time required to broadcast a message from the first processor to all the other processors.

Sample Input

5
50
30 5
100 20 50
10 x x 10
           

Sample Output

35
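To see where 35 comes from: the earliest arrival times from processor 1 are 10 for processor 5 (direct), 20 for processor 4 (via 5), 30 for processor 3 (direct), and 35 for processor 2 (via 3), and the broadcast ends when the last one arrives. The sketch below reproduces this with a priority-queue Dijkstra. The names `broadcastTime`, `sampleNetwork`, and `UNREACHABLE` are illustrative assumptions, and nodes are 0-indexed here.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Assumed sentinel for a missing link (an x entry).
const int UNREACHABLE = 0x3f3f3f3f;

// cost[i][j] is the direct cost between processors i and j (0-indexed).
// Returns the largest shortest-path distance from processor 0.
int broadcastTime(const std::vector<std::vector<int> >& cost) {
    int n = static_cast<int>(cost.size());
    if (n == 0) return 0;
    typedef std::pair<int, int> Item;     // (distance so far, node)
    std::vector<int> dist(n, UNREACHABLE);
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > pq;
    dist[0] = 0;
    pq.push(Item(0, 0));
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int d = top.first, u = top.second;
        if (d > dist[u]) continue;        // stale queue entry
        for (int v = 0; v < n; ++v) {
            if (cost[u][v] == UNREACHABLE) continue;
            if (d + cost[u][v] < dist[v]) {
                dist[v] = d + cost[u][v];
                pq.push(Item(dist[v], v));
            }
        }
    }
    return *std::max_element(dist.begin(), dist.end());
}

// The sample network from the statement, expanded to a full matrix.
std::vector<std::vector<int> > sampleNetwork() {
    const int X = UNREACHABLE;
    const int rows[5][5] = {
        {0,   50, 30, 100, 10},
        {50,  0,  5,  20,  X},
        {30,  5,  0,  50,  X},
        {100, 20, 50, 0,   10},
        {10,  X,  X,  10,  0}
    };
    std::vector<std::vector<int> > m(5, std::vector<int>(5));
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 5; ++j)
            m[i][j] = rows[i][j];
    return m;
}
```

Calling `broadcastTime(sampleNetwork())` returns 35, matching the sample output.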
           

Problem translation:

BIT's new supercomputer, a 32-processor Apollo Odyssey, currently broadcasts a message by having the first processor send it to the other n-1 processors one at a time, which serializes the work. Valentine's fix is to let every processor that already holds the message forward it to any number of its neighbors simultaneously. Each link has its own communication cost, and the goal is to minimize the total time needed for the broadcast.

Input

The first line gives n, the number of processors (1 <= n <= 100). The remaining lines give the strictly lower triangular part of the n x n adjacency matrix A: the second line holds A(2,1), the third line holds A(3,1) and A(3,2), and so on. Each entry is either an integer cost or the character x, which means there is no direct link. A(i,i) = 0, and the network is undirected, so A(i,j) = A(j,i).

Output

The minimum communication time required to broadcast a message from the first processor to all the other processors.

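Approach and problem summary:

Every processor forwards the message to all of its neighbors in parallel as soon as it receives it, so processor i gets the message at the shortest-path distance from processor 1 to i. The broadcast is complete when the farthest processor has the message, so the answer is the largest of these shortest distances. With n <= 100, an O(n^2) Dijkstra over the adjacency matrix is more than fast enough.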

AC code:

#include <cstdio>
#include <cstdlib>
#define inf 0x3f3f3f3f
#define MAX 105                 // nodes are 1-indexed and n can reach 100, so 100 slots are not enough
int dp[MAX][MAX];               // dp[i][j]: direct cost between i and j (inf if there is no link)
int dis[MAX],vis[MAX];          // dis[i]: shortest time from node 1; vis[i]: 1 once dis[i] is final
int n;
// O(n^2) Dijkstra from node 1 over the adjacency matrix.
void dijkstra()
{
    int i,j,minn,k;
    for(i=1; i<=n; i++)
    {
        dis[i]=inf;
        vis[i]=0;
    }
    dis[1]=0;
    for(i=1; i<=n; i++)
    {
        // pick the closest node that has not been finalized yet
        minn=inf;
        k=-1;
        for(j=1; j<=n; j++)
        {
            if(vis[j]==0 && dis[j]<minn)
            {
                minn=dis[j];
                k=j;
            }
        }
        if(k==-1)               // every remaining node is unreachable
            break;
        vis[k]=1;
        // relax every edge leaving k
        for(j=1; j<=n; j++)
            if(vis[j]==0 && dis[k]+dp[k][j]<dis[j])
                dis[j]=dis[k]+dp[k][j];
    }
}
int main()
{
    char s[32];
    int i,j;
    while(scanf("%d",&n)==1)
    {
        // no links yet: 0 on the diagonal, inf everywhere else
        for(i=1; i<=n; i++)
            for(j=1; j<=n; j++)
                if(i!=j)
                    dp[i][j]=inf;
                else
                    dp[i][j]=0;
        // read the strictly lower triangle; "x" means no direct link
        for(i=2; i<=n; i++)
            for(j=1; j<i; j++)
            {
                scanf("%31s",s);
                if(s[0]!='x')
                    dp[i][j]=dp[j][i]=atoi(s);
            }
        dijkstra();
        // the broadcast ends when the farthest node is reached
        int ans=0;              // reset for every test case; n = 1 gives 0
        for(i=2; i<=n; i++)
            if(dis[i]>ans)
                ans=dis[i];
        printf("%d\n",ans);
    }
    return 0;
}
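Because n is at most 100, an all-pairs approach would also fit comfortably within the limits. The sketch below swaps Dijkstra for Floyd-Warshall, which is not the method used in the code above; `broadcastTimeFloyd` and `NO_LINK` are hypothetical names, and the matrix is 0-indexed with `NO_LINK` standing for an `x` entry.

```cpp
#include <algorithm>
#include <vector>

// Assumed sentinel for a missing link; two of them still fit in an int when added.
const int NO_LINK = 0x3f3f3f3f;

// Takes the full symmetric cost matrix by value and turns it into
// all-pairs shortest distances, then returns the largest distance from node 0.
int broadcastTimeFloyd(std::vector<std::vector<int> > d) {
    int n = static_cast<int>(d.size());
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j];  // going through k is shorter
    int ans = 0;
    for (int j = 1; j < n; ++j)
        ans = std::max(ans, d[0][j]);
    return ans;
}
```

This runs in O(n^3), about a million inner steps for n = 100, and is shorter to write, but it computes far more than the single-source distances the problem actually needs.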