天天看點

ArcEngine導出資料量較大的屬性表(ICursor接口的使用)

之前的博文“ArcEngine檢視資料量較大屬性表的解決方案”中,用了ICursor接口來實作資料量較大屬性表的分頁查詢。既然實作了分頁查詢,那麼就有可能會遇到資料量較大的屬性表導出至本地的需求(雖然我覺得這樣做意義不大,但是畢竟也算是客戶的需求)。在這種情況下,一次性導出至本地顯然是不合理的,勢必會造成記憶體溢出的異常,是以考慮用ICursor接口。在ArcObject的幫助文檔中對ICursor的說明中我看到了這樣一句話:

Cursors are forward only; they do not support backing up and retrieving rows that have already been retrieved or making multiple passes over data. If an application needs to make multiple passes over the data, the application needs to reexecute the query that returned the cursor.

大概意思就是遊标(Cursor)是單向、不可逆的,不能被檢索的,如果要檢索或者查詢那麼就要從頭來過,是以對于資料量較大的屬性表如果通過行号來檢索剛開始可能造成的時間備援不多,但是查詢越往後的分頁時間花費得越多,遠遠超出IO讀寫所需要得時間。經過測試,用之前沒有優化得代碼來導出約3萬條資料大約需要十分鐘,優化之後不到一分鐘。下面是核心代碼(excel支援的最大行數有限(65535?),是以對于資料量較大的表,導出為txt或csv是更好的選擇):

/// <summary>
        /// 分塊導出屬性表(用于屬性表屬性較大的情況)
        /// </summary>
        /// <param name="exportPath"></param>
        /// <param name="dataTable"></param>
        /// <param name="shapeType"></param>
        /// <param name="pCursor"></param>
        private bool ExportTableByBlock2(string exportPath, DataTable dataTable, string shapeType, int totalRowCount, ref ICursor pCursor)
        {
            try
            {
                int n = 0;
                IRow pRow = null;
                DataRow pDataRow = null;
                ExportDataTableToCSV(dataTable, exportPath);
                while ((pRow = pCursor.NextRow()) != null)
                {
                    ++n;
                    pDataRow = dataTable.NewRow();
                    //500條追加一次
                    if (n % 500 == 0 || n == totalRowCount)
                    {
                        for (int i = 0; i < pRow.Fields.FieldCount; i++)
                        {
                            //如果字段類型為esriFieldTypeGeometry,則根據圖層類型設定字段值
                            if (pRow.Fields.get_Field(i).Type == esriFieldType.esriFieldTypeGeometry)
                            {
                                pDataRow["Shape"] = shapeType;
                            }
                            //當圖層類型為Anotation時,要素類中會有esriFieldTypeBlob類型的資料,
                            //其存儲的是标注内容,如此情況需将對應的字段值設定為Element
                            else if (pRow.Fields.get_Field(i).Type == esriFieldType.esriFieldTypeBlob)
                            {
                                pDataRow[i] = "Element";
                            }
                            else
                            {
                                pDataRow[pRow.Fields.Field[i].Name] = pRow.get_Value(i);
                            }
                        }
                        dataTable.Rows.Add(pDataRow);
                        pDataRow = null;
                        using (StreamWriter sw = new StreamWriter(new System.IO.FileStream(exportPath, FileMode.Append, FileAccess.Write), Encoding.UTF8))
                        {
                            AppendDataTableToCSV(dataTable, exportPath, sw, n - 500);
                            sw.Close();
                        }
                        dataTable.Rows.Clear();
                        continue;
                    }

                    for (int i = 0; i < pRow.Fields.FieldCount; i++)
                    {
                        //如果字段類型為esriFieldTypeGeometry,則根據圖層類型設定字段值
                        if (pRow.Fields.get_Field(i).Type == esriFieldType.esriFieldTypeGeometry)
                        {
                            pDataRow["Shape"] = shapeType;
                        }
                        //當圖層類型為Anotation時,要素類中會有esriFieldTypeBlob類型的資料,
                        //其存儲的是标注内容,如此情況需将對應的字段值設定為Element
                        else if (pRow.Fields.get_Field(i).Type == esriFieldType.esriFieldTypeBlob)
                        {
                            pDataRow[i] = "Element";
                        }
                        else
                        {
                            pDataRow[pRow.Fields.Field[i].Name] = pRow.get_Value(i);
                        }
                    }
                    dataTable.Rows.Add(pDataRow);
                    pDataRow = null;
                }
                return true;
            }
            catch (Exception ex)
            {
                return false;
                throw ex;
            }
        }        

        /// <summary>
        /// DataTable對象導出成csv
        /// </summary>
        /// <param name="dataTable"></param>
        /// <param name="exportPath"></param>
        /// <param name="isWriteColumnName">是否寫入列名,預設寫入</param>
        public void ExportDataTableToCSV(DataTable dataTable, string exportPath, bool isWriteColumnName = true)
        {
            using (StreamWriter sw = new StreamWriter(new FileStream(exportPath, FileMode.Create, FileAccess.Write), Encoding.UTF8))
            {
                StringBuilder sb = new StringBuilder();
                if (isWriteColumnName)
                {
                    for (int i = 0; i < dataTable.Columns.Count; i++)
                    {
                        sb.Append(dataTable.Columns[i].ColumnName + ",");
                    }
                    sb.Remove(sb.Length - 1, 1);
                    sw.WriteLine(sb);
                }

                sb = new StringBuilder();
                for (int i = 0; i < dataTable.Rows.Count; i++)
                {
                    sb = new StringBuilder();
                    for (int j = 0; j < dataTable.Columns.Count; j++)
                    {
                        sb.Append(dataTable.Rows[i][j].ToString() + ",");
                    }
                    sb.Remove(sb.Length - 1, 1);
                    sw.WriteLine(sb);
                }
                sw.Close();
            }
        }

        /// <summary>
        /// 通過追加的方式将DataTable導出成xlsx檔案(适用于導出資料量較大的情況)
        /// </summary>
        /// <param name="dataTable">導出的DataTable表格(一般是大表中的子表)</param>
        /// <param name="exportPath">導出路徑</param>
        /// <param name="sw">寫入流對象,需要提前初始化</param>
        /// <param name="rowIndex">行序号(DataTable在大表中的行初始序号)</param>
        /// <param name="isWriteColumnName">是否寫入列名,第一個子表為true,後面均為false</param>
        public void AppendDataTableToCSV(DataTable dataTable, string exportPath, StreamWriter sw, int rowIndex, bool isWriteColumnName = false)
        {
            try
            {
                StringBuilder sb = new StringBuilder();
                if (isWriteColumnName)
                {
                    for (int i = 0; i < dataTable.Columns.Count; i++)
                    {
                        sb.Append(dataTable.Columns[i].ColumnName + ",");
                    }
                    sb.Remove(sb.Length - 1, 1);
                    sw.WriteLine(sb);
                }

                sb = new StringBuilder();
                for (int i = 0; i < dataTable.Rows.Count; i++)
                {
                    sb = new StringBuilder();
                    for (int j = 0; j < dataTable.Columns.Count; j++)
                    {
                        sb.Append(dataTable.Rows[i][j].ToString().Replace("\n", "").Replace("\r", "") + ",");
                    }
                    sb.Remove(sb.Length - 1, 1);
                    sw.WriteLine(sb);
                }
            }
            catch (Exception ex)
            {
                throw(ex);
            }
        }
           

大緻的思路就是先導出一個空的包含表頭的表,然後用一個引用ICursor變量來記錄已經周遊的遊标,一個數值型變量來記錄已導出的條數,每500條追加導出一次。這樣既解決了效率的問題,又不容易記憶體溢出。