0 code test

// E.g. For 8 channels:
// Array Order :  0  1  2  3  4  5  6  7  8     9     10    etc. 16       etc...
// Sample Order:  A0 B0 C0 D0 E0 F0 G0 H0 A1    B1    C1    etc. A2       etc...
// Output Order:  A0 B0 C0 D0 E0 F0 G0 H0 A0+A1 B0+B1 C0+C1 etc. A0+A1+A2 etc...


#include <stdio.h>
typedef short din_t;
typedef short dout_t;
typedef int dacc_t;

#define CHANNELS 8
#define SAMPLES  4
#define N (CHANNELS * SAMPLES)

void array_io (dout_t d_o[N], din_t d_i[N]) {
	int i, rem;
	// Store accumulated data (static, so all 8 entries start at 0)
	static dacc_t acc[CHANNELS];
	// Accumulate each channel
	For_Loop: for (i=0;i<N;i++) {
		rem=i%CHANNELS; // which of the 8 channels this sample belongs to
		acc[rem]= acc[rem] + d_i[i]; // update the running sum for that channel
		d_o[i] = acc[rem];
	}
}
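
The output order described in the header comment can be checked in software against a stateless reference model. The following is a minimal sketch, not part of the original design: the name `reference_model` and the `REF_` macros are hypothetical, and the model only matches the first call of array_io, because the static accumulator keeps its contents between calls.

```c
#define REF_CHANNELS 8
#define REF_SAMPLES  4
#define REF_N (REF_CHANNELS * REF_SAMPLES)

/* Hypothetical reference model: out[i] is the sum of every sample of the
 * same channel seen so far, i.e. in[i] + in[i-8] + in[i-16] + ... */
void reference_model(short out[REF_N], const short in[REF_N]) {
    for (int i = 0; i < REF_N; i++) {
        int sum = 0;
        /* Walk backwards through earlier samples of the same channel. */
        for (int j = i; j >= 0; j -= REF_CHANNELS) {
            sum += in[j];
        }
        out[i] = (short)sum;
    }
}
```

A testbench could fill d_i with known values, call array_io once, and compare d_o element by element with the output of this model.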

By default, the array arguments are synthesized as RAM (memory) interfaces.

1 Array interfaces and storage

1.1 input array

The input array is handled as a resource (a memory). Mapping it to a dual-port RAM speeds up reading the input.

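A memory interface corresponds to an array in the C source.

With a dual-port RAM, two reads can be performed at the same time.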

1.2 output array

The output array is handled as an interface. With the FIFO option selected, the output runs at single-port speed.

A FIFO interface corresponds to a pointer in the C source.
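
The two choices above can also be written as pragmas inside the function instead of being set in the GUI. The following is a minimal sketch assuming the Vivado HLS pragma syntax (`INTERFACE ap_fifo` and `RESOURCE ... core=RAM_2P_BRAM`); newer tool versions may expect a different directive for the RAM type. The function name `array_io_fifo` is hypothetical. An ordinary C compiler ignores the unknown pragmas, possibly with a warning, so the code still runs in software.

```c
typedef short din_t;
typedef short dout_t;
typedef int dacc_t;

#define CHANNELS 8
#define SAMPLES  4
#define N (CHANNELS * SAMPLES)

/* Hypothetical variant of array_io with the interface choices as pragmas. */
void array_io_fifo(dout_t d_o[N], din_t d_i[N]) {
/* Assumption: Vivado HLS syntax. Output as a FIFO, input as a dual-port BRAM. */
#pragma HLS INTERFACE ap_fifo port=d_o
#pragma HLS RESOURCE variable=d_i core=RAM_2P_BRAM
    static dacc_t acc[CHANNELS]; /* zero-initialized running sums */
    for (int i = 0; i < N; i++) {
        int rem = i % CHANNELS;
        acc[rem] = acc[rem] + d_i[i];
        /* d_o is written strictly in order, as a FIFO requires. */
        d_o[i] = (dout_t)acc[rem];
    }
}
```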

2 Array optimization

2.1 #pragma HLS ARRAY_PARTITION

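Array partitioning can improve throughput. An array is normally stored in block RAM, which has at most two data ports, so the number of accesses per cycle is limited. Partitioning splits the array across several RAMs, which raises throughput.

<type>:

cyclic: interleaves the elements of the original array across factor smaller arrays (element 0 goes to the first array, element 1 to the second, and so on, wrapping around)

block: splits the original array into factor smaller arrays of consecutive elements

complete: breaks the original array into individual elements, each held in its own register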

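The result is several small memories or several registers instead of one large memory.
This effectively increases the number of read and write ports available for storage.
It can potentially improve the throughput of the design.
It requires more memory instances or registers.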
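
The difference between cyclic and block partitioning comes down to which smaller array each element lands in. The following is a minimal sketch in plain C of that mapping; the helper names `cyclic_bank` and `block_bank` are hypothetical, and this only illustrates the element placement, not how the tool implements it.

```c
/* Index of the smaller array that element i goes to under cyclic
 * partitioning: elements are dealt out round-robin. */
int cyclic_bank(int i, int factor) {
    return i % factor;
}

/* Index of the smaller array that element i goes to under block
 * partitioning of an n-element array: consecutive elements stay together.
 * The block size is rounded up when n is not a multiple of factor. */
int block_bank(int i, int n, int factor) {
    int block_size = (n + factor - 1) / factor;
    return i / block_size;
}
```

For an 8-element array with factor 2, cyclic partitioning places elements 0, 2, 4, 6 in the first array and 1, 3, 5, 7 in the second, while block partitioning places 0 to 3 in the first and 4 to 7 in the second.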

dim: selects which dimension of a multi-dimensional array is partitioned.

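For example, a two-dimensional array AB[6][4] partitioned along its second dimension (dim=2) with factor 2 is split into two [6][2] arrays.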

2.2 #pragma HLS ARRAY_PARTITION variable=d_i complete dim=1
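
Placed inside the function body, the pragma looks like this. This is a minimal sketch: the function name `array_io_partitioned` is hypothetical, and the expectation that complete partitioning gives every element of d_i its own port is my reading of the directive, to be confirmed in the synthesis report. A plain C compiler ignores the pragma.

```c
typedef short din_t;
typedef short dout_t;
typedef int dacc_t;

#define CHANNELS 8
#define SAMPLES  4
#define N (CHANNELS * SAMPLES)

/* Hypothetical variant of array_io with the input array fully partitioned. */
void array_io_partitioned(dout_t d_o[N], din_t d_i[N]) {
/* Split d_i along its only dimension into N individual elements. */
#pragma HLS ARRAY_PARTITION variable=d_i complete dim=1
    static dacc_t acc[CHANNELS]; /* zero-initialized running sums */
    for (int i = 0; i < N; i++) {
        int rem = i % CHANNELS;
        acc[rem] = acc[rem] + d_i[i];
        d_o[i] = (dout_t)acc[rem];
    }
}
```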

3 Using AXI4-Stream: the best choice

The corresponding C type is a pointer.
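
As a rough illustration, the streaming ports could be declared with `INTERFACE axis` pragmas on pointer arguments. This is only a sketch under assumptions: the function name `array_io_axis` is hypothetical, and whether the tool accepts pointer arguments exactly as written (rather than array arguments, or with extra depth settings for co-simulation) depends on the tool version. To a C compiler, `din_t *d_i` and `din_t d_i[N]` are the same parameter type.

```c
typedef short din_t;
typedef short dout_t;
typedef int dacc_t;

#define CHANNELS 8
#define SAMPLES  4
#define N (CHANNELS * SAMPLES)

/* Hypothetical variant of array_io with both ports as AXI4-Stream. */
void array_io_axis(dout_t *d_o, din_t *d_i) {
/* Assumption: Vivado HLS syntax for AXI4-Stream ports. */
#pragma HLS INTERFACE axis port=d_o
#pragma HLS INTERFACE axis port=d_i
    static dacc_t acc[CHANNELS]; /* zero-initialized running sums */
    for (int i = 0; i < N; i++) {
        int rem = i % CHANNELS;
        /* Both streams are accessed strictly in order, one element per iteration. */
        acc[rem] = acc[rem] + d_i[i];
        d_o[i] = (dout_t)acc[rem];
    }
}
```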

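3.1 #pragma HLS unroll

Unrolling replicates the loop body factor times so that the copies can run in parallel, which reduces the trip count to (loop length)/factor. This can increase the amount of data fetched per cycle and the throughput.

unroll optimizes across the iterations of the whole loop.

region: with this option, only the for loops inside the region (the body of the enclosing loop) are unrolled; the enclosing loop itself is not.
skip_exit_check: applies only to partial unrolling with factor=. It omits the exit check in the unrolled body, which is safe only when the trip count is a multiple of factor.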

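Partial unrolling with factor

for(int i = 0; i < X; i++) {
  #pragma HLS unroll factor=2
  a[i] = b[i] + c[i];
}

The elements are processed two at a time: the first element of each pair is handled by the first copy of the body and the second element by the second copy. The loop above is equivalent to: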

for(int i = 0; i < X; i += 2) {
  a[i] = b[i] + c[i];
  if (i+1 >= X) break;
  a[i+1] = b[i+1] + c[i+1];
}
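
That the rolled and the manually unrolled loop compute the same result, including for an odd trip count, can be checked in plain C. The following is a minimal sketch; the function names are hypothetical, and the third function shows what dropping the exit check amounts to, which is only valid when the trip count is even.

```c
/* Original rolled loop. */
void add_rolled(int *a, const int *b, const int *c, int x) {
    for (int i = 0; i < x; i++) {
        a[i] = b[i] + c[i];
    }
}

/* Unrolled by 2 with the exit check: correct for any x. */
void add_unrolled2(int *a, const int *b, const int *c, int x) {
    for (int i = 0; i < x; i += 2) {
        a[i] = b[i] + c[i];
        if (i + 1 >= x) break; /* guards the second copy when x is odd */
        a[i + 1] = b[i + 1] + c[i + 1];
    }
}

/* Unrolled by 2 without the exit check: the caller must guarantee x is even. */
void add_unrolled2_nocheck(int *a, const int *b, const int *c, int x) {
    for (int i = 0; i < x; i += 2) {
        a[i] = b[i] + c[i];
        a[i + 1] = b[i + 1] + c[i + 1];
    }
}
```

The version without the check saves the comparison in every iteration, but with an odd x it would read and write one element past the intended range.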

With region, loop_1 itself is not unrolled; only loop_2 and loop_3 inside it are:

void foo(int data_in[N], int scale, int data_out1[N], int data_out2[N]) {
  int temp1[N];
  loop_1: for(int i = 0; i < N; i++) {  
    #pragma HLS unroll region
    temp1[i] = data_in[i] * scale;
      loop_2: for(int j = 0; j < N; j++) {
        data_out1[j] = temp1[j] * 123;
      }
      loop_3: for(int k = 0; k < N; k++) {
        data_out2[k] = temp1[k] * 456;
      }
  }
}

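3.2 #pragma HLS pipeline

Pipelining reduces the initiation interval (II). With II = N, a new loop iteration starts every N clock cycles. The default is 1.

pipeline optimizes the inside of a single loop iteration.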

enable_flush: when the data valid signal at the pipeline input goes low, the pipeline flushes and then pauses.

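rewind: for a structure that contains only one loop, any code before the loop runs once at the start, and the next execution of the loop then follows the previous one without a pause.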
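
A pipelined loop with these options can be sketched as follows. This is a minimal example under assumptions: the function `scale_pipelined` and its sizes are hypothetical, and whether II=1 is actually reached has to be read from the synthesis report. A plain C compiler ignores the pragma.

```c
#define LEN 16

/* Hypothetical single-loop function, the shape for which rewind is meant. */
void scale_pipelined(int out[LEN], const int in[LEN], int k) {
    for (int i = 0; i < LEN; i++) {
/* Request a new iteration every clock cycle, with continuous restart. */
#pragma HLS pipeline II=1 rewind
        out[i] = in[i] * k;
    }
}
```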
