Emgucv Incomplete Image Segmentation Experiments (8): OCR (Chinese Characters)

2023-06-26 20:44:04

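After a crawler downloads images from a website, the next step is to recognize the characters in those images.

The core of this is not particularly hard. What I got stuck on was drawing the recognized characters with Emgucv.

I later found that people using OpenCV from Python handle this with PIL, and Python has its own Chinese font support, so swapping it in is easy.

Material on Emgucv, however, is scarce, so in the end I came up with a rather ridiculous trick: draw two new images myself, a coloured one as the target (think of it as a watermark) and a black-and-white one as the mask. The final result is shown in the figure below.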

[Figure: final result]
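To make the idea concrete, here is a minimal sketch of the two-image trick on plain arrays, with no Emgucv involved: binarize a grayscale rendering of the text to get the mask, then copy the coloured pixels only where the mask is set. The names `MaskedCopyDemo`, `ThresholdBinary` and `CopyWithMask` are made up for illustration, and pixels are stored as packed integers purely for simplicity; this only mimics the behaviour and is not the library's implementation.

```csharp
using System;

static class MaskedCopyDemo
{
    // Binarize: every value strictly above `threshold` becomes 255, the rest 0.
    public static byte[,] ThresholdBinary(byte[,] gray, byte threshold)
    {
        int rows = gray.GetLength(0), cols = gray.GetLength(1);
        var mask = new byte[rows, cols];
        for (int y = 0; y < rows; y++)
            for (int x = 0; x < cols; x++)
                mask[y, x] = gray[y, x] > threshold ? (byte)255 : (byte)0;
        return mask;
    }

    // Copy `src` onto `dest` only where `mask` is non-zero.
    // All three arrays are assumed to have the same dimensions.
    public static void CopyWithMask(int[,] src, int[,] dest, byte[,] mask)
    {
        int rows = src.GetLength(0), cols = src.GetLength(1);
        if (dest.GetLength(0) != rows || dest.GetLength(1) != cols ||
            mask.GetLength(0) != rows || mask.GetLength(1) != cols)
            throw new ArgumentException("src, dest and mask must have the same size");

        for (int y = 0; y < rows; y++)
            for (int x = 0; x < cols; x++)
                if (mask[y, x] != 0)
                    dest[y, x] = src[y, x];
    }
}
```

Pixels of the destination that fall outside the mask are left untouched, which is why the text appears to be stamped onto the original picture rather than replacing it.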

The core code is as follows:

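// Draw the recognized text in red onto a new, blank bitmap (the "watermark" image).
System.Drawing.Bitmap bmp = new System.Drawing.Bitmap(w, h);
using (Graphics g = Graphics.FromImage(bmp))
using (Font drawFont = new Font("宋體", 18, FontStyle.Bold))
{
    g.DrawString(ep.words, drawFont, Brushes.Red, new PointF(0, 0));
}

// temp2 is the coloured text image; temp3 is a grayscale copy used to build the mask.
Image<Bgr, byte> temp2 = new Image<Bgr, byte>(bmp);
Image<Gray, byte> temp3 = new Image<Gray, byte>(bmp);
// I was lazy here: drawing a separate black-and-white image would be better, but I simply binarize.
temp3 = temp3.ThresholdBinary(new Gray(1d), new Gray(255d));

// Copy temp2 onto tempImage, but only where the mask is non-zero (the text pixels).
temp2.Copy(tempImage, temp3);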
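The comment in the code above notes that drawing a separate black-and-white image would be better than thresholding. A rough sketch of that variant might look like the following. It assumes the same Emgucv version as above (one where `Image<TColor, TDepth>` can be constructed from a `Bitmap`), and the helper name `TextOverlay.DrawText` and its parameters are hypothetical, not part of the original code.

```csharp
using System.Drawing;
using System.Drawing.Text;
using Emgu.CV;
using Emgu.CV.Structure;

static class TextOverlay
{
    // Hypothetical helper: stamps `text` in red onto `target` at the top-left corner,
    // using a separately drawn white-on-black mask instead of a thresholded copy.
    public static void DrawText(Image<Bgr, byte> target, string text, Font font)
    {
        int w = target.Width, h = target.Height;
        using (Bitmap colorBmp = new Bitmap(w, h))
        using (Bitmap maskBmp = new Bitmap(w, h))
        {
            // Coloured image: red text on a black background.
            using (Graphics g = Graphics.FromImage(colorBmp))
            {
                g.Clear(Color.Black);
                g.TextRenderingHint = TextRenderingHint.SingleBitPerPixelGridFit;
                g.DrawString(text, font, Brushes.Red, new PointF(0, 0));
            }
            // Mask image: the same text in white, drawn with the same settings
            // so that both images cover the same pixels.
            using (Graphics g = Graphics.FromImage(maskBmp))
            {
                g.Clear(Color.Black);
                g.TextRenderingHint = TextRenderingHint.SingleBitPerPixelGridFit;
                g.DrawString(text, font, Brushes.White, new PointF(0, 0));
            }
            using (Image<Bgr, byte> color = new Image<Bgr, byte>(colorBmp))
            using (Image<Gray, byte> mask = new Image<Gray, byte>(maskBmp))
            {
                // Copy the coloured text onto the target wherever the mask is non-zero.
                color.Copy(target, mask);
            }
        }
    }
}
```

Turning off anti-aliasing through `TextRenderingHint` is a design choice in this sketch: it keeps the mask strictly black and white, so no half-transparent edge pixels end up in the result.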

I have only just started this blog, and its purpose is exchange and collaboration. My QQ: 273651820.
